We present a new approach to showing that random graphs are nearly optimal expanders.
This approach is based on recent deep results in combinatorial group theory. It applies to both
regular and irregular random graphs.

Let $\Gamma$ be a random $d$-regular graph on $n$ vertices, and let $\lambda$ be the largest absolute value
of a non-trivial eigenvalue of its adjacency matrix. It was conjectured by Alon [Alo86] that
a random $d$-regular graph is "almost Ramanujan", in the following sense: for every $\varepsilon > 0$,
$\lambda < 2\sqrt{d-1} + \varepsilon$ a.a.s. Friedman famously presented a proof of this conjecture in [Fri08]. Here
we suggest a new, substantially simpler proof of a nearly-optimal result: we show that for $d$
even, a random $d$-regular graph satisfies $\lambda < 2\sqrt{d-1} + 1$ asymptotically almost surely.

A main advantage of our approach is that it is applicable to a generalized conjecture: A
$d$-regular graph on $n$ vertices is an $n$-covering space of a bouquet of $d/2$ loops. More generally,
fixing an arbitrary base graph $\Omega$, we study the spectrum of $\Gamma$, a random $n$-covering of $\Omega$. Let
$\lambda$ be the largest absolute value of a non-trivial eigenvalue of $\Gamma$. Extending Alon's conjecture to
this more general model, Friedman [Fri03] conjectured that for every $\varepsilon > 0$, a.a.s. $\lambda < \rho + \varepsilon$,
where $\rho$ is the spectral radius of the universal cover of $\Omega$. When $\Omega$ is regular we get the same
bound as before: $\rho + 1$, and for an arbitrary $\Omega$, we prove a nearly optimal upper bound of
$\sqrt{3}\rho$. This is a substantial improvement upon all known results (by Friedman, Linial-Puder,
Lubetzky-Sudakov-Vu and Addario-Berry-Griffiths).
1 Introduction

Random d-regular graphs

Let $\Gamma$ be a finite $d$-regular graph† on $n$ vertices ($d \ge 3$) and let $A_\Gamma$ be its adjacency matrix. The
spectrum of $\Gamma$ is the spectrum of $A_\Gamma$ and consists of $n$ real eigenvalues,

$$d = \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n \ge -d.$$

The eigenvalue $\lambda_1 = d$ corresponds to constant functions and is considered the trivial eigenvalue of
$\Gamma$. Let $\lambda(\Gamma)$ be the largest absolute value of a non-trivial eigenvalue of $\Gamma$, i.e.
$\lambda(\Gamma) = \max\{\lambda_2, -\lambda_n\}$. This value measures the spectral expansion of the graph: the smaller
$\lambda(\Gamma)$ is, the better expander $\Gamma$ is (see Appendix B for details).

The well-known Alon-Boppana bound states that $\lambda(\Gamma) \ge 2\sqrt{d-1} - o_n(1)$ ([Nil91]), bounding
the spectral expansion of an infinite family of $d$-regular graphs. There is no equivalent deterministic
non-trivial upper bound: for example, if $\Gamma$ is disconnected or bipartite then $\lambda(\Gamma) = d$. However,
Alon conjectured [Alo86, Conj. 5.1] that if $\Gamma$ is a random $d$-regular graph, then
$\lambda(\Gamma) \le 2\sqrt{d-1} + o_n(1)$ a.a.s. (asymptotically almost surely, i.e. with probability tending
to 1 as $n \to \infty$)‡.
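
To make the definition of $\lambda(\Gamma)$ concrete, here is a minimal Python sketch (not part of the original argument). It assumes NumPy is available; the helper names `nontrivial_lambda` and `petersen_adjacency` are ours. It evaluates $\lambda(\Gamma)$ for the Petersen graph, a 3-regular graph on 10 vertices, and compares it with the Ramanujan threshold $2\sqrt{d-1}$.

```python
import numpy as np


def nontrivial_lambda(adj: np.ndarray) -> float:
    """Return max{lambda_2, -lambda_n} for the adjacency matrix of a d-regular graph."""
    eig = np.sort(np.linalg.eigvalsh(adj))[::-1]  # lambda_1 >= ... >= lambda_n
    return float(max(eig[1], -eig[-1]))


def petersen_adjacency() -> np.ndarray:
    """Adjacency matrix of the Petersen graph (outer 5-cycle, inner pentagram, spokes)."""
    adj = np.zeros((10, 10))
    for i in range(5):
        edges = (
            (i, (i + 1) % 5),          # outer cycle
            (5 + i, 5 + (i + 2) % 5),  # inner pentagram
            (i, i + 5),                # spoke
        )
        for u, v in edges:
            adj[u, v] = adj[v, u] = 1
    return adj


if __name__ == "__main__":
    d = 3
    lam = nontrivial_lambda(petersen_adjacency())
    print("lambda(Gamma) =", lam, " 2*sqrt(d-1) =", 2 * np.sqrt(d - 1))
```

The spectrum of the Petersen graph is $3, 1, -2$ (with multiplicities 1, 5, 4), so $\lambda(\Gamma) = 2 < 2\sqrt{2}$ and the graph is Ramanujan.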

Since then, a series of papers have dealt with this conjecture. One approach, due to Kahn and
Szemerédi, studies the Rayleigh quotient of the adjacency matrix $A_\Gamma$ and shows that it is likely
to be small on all points of an appropriate $\varepsilon$-net on the unit sphere. This approach yielded an
asymptotic bound of $\lambda(\Gamma) < c\sqrt{d}$ for some unspecified constant $c$ [FKS89]. In the recent work
[DJPP11, Thm. 26], it is shown that the constant $c$ can be taken to be $10^4$. Other works, as well as
the current paper, are based on the idea of the trace method, which amounts to bounding $\lambda(\Gamma)$ by
means of counting closed paths in $\Gamma$. These works include [BS87], in which Broder and Shamir show
that a.a.s. $\lambda(\Gamma) \le \sqrt{2}\, d^{3/4} + \varepsilon$ ($\forall \varepsilon > 0$); [Fri91], where Friedman obtains
$\lambda(\Gamma) \le 2\sqrt{d-1} + 2\log d + c$ a.a.s.; and, most famously, Friedman's 100-page-long proof of
Alon's conjecture [Fri08]. Friedman shows that for every $\varepsilon > 0$, $\lambda(\Gamma) < 2\sqrt{d-1} + \varepsilon$ a.a.s.

In the current paper we prove a result which is slightly weaker than Friedman's. However, the
proof we present is substantially shorter and simpler than the sophisticated proof in [Fri08]. We
show the following:

Theorem 1.1. For $d$ even, let $\Gamma$ be a random $d$-regular simple graph on $n$ vertices chosen at
uniform distribution. Then

$$\lambda(\Gamma) < 2\sqrt{d-1} + 0.84$$

asymptotically almost surely§.

The same result holds also for random $d$-regular graphs in the permutation model (see below).
In fact, we first prove the result stated in Theorem 1.1 for random graphs in this model. The
derivation of Theorem 1.1 for the uniform model is then immediate by results of Wormald [Wor99]
and Greenhill et al. [GJKW02] showing the contiguity† of different models of random regular graphs
(see Appendix A).

† Unless otherwise specified, a graph in this paper is undirected and may contain loops and multiple edges. A
graph without loops and without multiple edges is called here simple.

‡ In fact, Alon's original conjecture referred only to $\lambda_2(\Gamma)$, the second largest eigenvalue.

§ For small $d$'s a better bound is attainable; see Remark 6.3.

In the permutation model, which we denote by $\mathcal{P}_{n,d}$, a random $d$-regular graph $\Gamma$ on the set
of vertices $[n]$ is obtained by choosing independently and uniformly at random $\frac{d}{2}$ permutations
$\sigma_1, \dots, \sigma_{d/2}$ in the symmetric group $S_n$, and introducing an edge $(v, \sigma_j(v))$ for every $v \in [n]$ and
$j \in \{1, \dots, \frac{d}{2}\}$ (and hence the restriction to $d$ even). Of course, $\Gamma$ may be disconnected and can
have loops or multiple edges. This model applies only to even values of $d$.
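
The sampling procedure just described can be sketched in a few lines of Python. This is only an illustration under our own conventions, not code from the paper: it assumes NumPy, the function names are ours, and a loop $(v, v)$ is recorded as 2 on the diagonal so that every row of the adjacency matrix sums to $d$.

```python
import numpy as np


def permutation_model(n: int, d: int, rng: np.random.Generator) -> np.ndarray:
    """Sample the adjacency matrix of a random d-regular multigraph on [n] (d even)."""
    if d % 2 != 0:
        raise ValueError("the permutation model requires d to be even")
    adj = np.zeros((n, n))
    for _ in range(d // 2):
        sigma = rng.permutation(n)      # uniform element of S_n
        for v in range(n):
            adj[v, sigma[v]] += 1       # edge (v, sigma(v)) seen from v
            adj[sigma[v], v] += 1       # the same edge seen from sigma(v)
    return adj


def nontrivial_lambda(adj: np.ndarray) -> float:
    """Return max{lambda_2, -lambda_n}."""
    eig = np.sort(np.linalg.eigvalsh(adj))[::-1]
    return float(max(eig[1], -eig[-1]))


if __name__ == "__main__":
    rng = np.random.default_rng(2024)   # arbitrary seed, for reproducibility only
    n, d = 2000, 6
    lam = nontrivial_lambda(permutation_model(n, d, rng))
    print("lambda(Gamma) =", lam, " 2*sqrt(d-1) =", 2 * np.sqrt(d - 1))
```

A single sample proves nothing about the a.a.s. statement, of course; the sketch only shows how one would inspect $\lambda(\Gamma)$ numerically for moderate $n$.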

We stress that even after Alon's conjecture is established, many open questions remain
concerning $\lambda(\Gamma)$. In fact, very little is known about the distribution of $\lambda(\Gamma)$. A major open question
is the following: what is the probability that a random $d$-regular graph is Ramanujan, i.e. that
$\lambda(\Gamma) \le 2\sqrt{d-1}$? There is experimental evidence that this probability tends to 27% as $n$ grows
[MNS08]. However, even the following, much weaker question is not known: are there infinitely
many Ramanujan $d$-regular graphs for every $d \ge 3$? The only positive results here are by explicit
constructions of Ramanujan graphs when $d - 1$ is a prime power by [LPS88, Mar88, Mor94]. We
hope our new approach may eventually contribute to answering these open questions.

Random coverings of a fixed base graph 

The hidden reason for the number $2\sqrt{d-1}$ in Alon's conjecture and the Alon-Boppana Theorem is
the following: All finite $d$-regular graphs are covered by the $d$-regular (infinite) tree $T = T_d$. Let
$A_T : \ell^2(V(T)) \to \ell^2(V(T))$ be the adjacency operator of the tree, defined by

$$(A_T f)(u) = \sum_{v \sim u} f(v).$$

Then $A_T$ is a self-adjoint operator and, as first proven by Kesten [Kes59], the spectrum of $A_T$
is $[-2\sqrt{d-1}, 2\sqrt{d-1}]$. Namely, $2\sqrt{d-1}$ is the spectral radius‡ of $A_T$. In this respect, among
all possible (finite) quotients of the tree, Ramanujan graphs are "ideal", having their non-trivial
spectrum as good as the "ideal object" $T$.

It is therefore natural to measure the spectrum of any graph $\Gamma$ against the spectral radius of
its covering tree. Several authors call graphs whose non-trivial spectrum is bounded by this value
Ramanujan, generalizing the regular case. Many of the results and questions regarding the spectrum
of $d$-regular graphs extend to this general case. An analogue of Alon-Boppana's Theorem is given
in Proposition 1.2 and Alon's conjecture on almost-Ramanujan graphs is generalized in Conjecture
1.3 below. Furthermore, one can ask if a given tree has infinitely many Ramanujan quotients, and
it turns out that there exist trees which have none at all [LN98].

We now describe a generalization of the permutation model for random regular graphs, which
generates families of graphs with a common universal covering tree. A random graph $\Gamma$ in the
permutation model $\mathcal{P}_{n,d}$ can be equivalently thought of as a random $n$-sheeted covering space of
the bouquet with $\frac{d}{2}$ loops. In a similar fashion, fix a finite, connected base graph $\Omega$, and let $\Gamma$ be
a random $n$-covering space of $\Omega$. More specifically, $\Gamma$ is sampled as follows: its set of vertices is
$V(\Omega) \times [n]$. A permutation $\sigma_e \in S_n$ is then chosen uniformly and independently at random for
every edge $e = (u, v)$ of $\Omega$, and for every $i \in [n]$ the edge $((u, i), (v, \sigma_e(i)))$ is introduced in $\Gamma$§.
We denote this model by $\mathcal{C}_{n,\Omega}$ (so that $\mathcal{C}_{n,B_d} = \mathcal{P}_{n,d}$, where $B_d$ is the bouquet with $\frac{d}{2}$ loops). For
example, all bipartite $d$-regular graphs on $2n$ vertices cover the graph with two vertices and
$d$ edges connecting them. Various properties of random graphs in the $\mathcal{C}_{n,\Omega}$ model were thoroughly
examined over the last decade (e.g. [AL02, ALM02, Fri03, LR05, AL06, BL06, LP10]). From now
on, by a "random $n$-covering of $\Omega$" we shall mean a random graph in the model $\mathcal{C}_{n,\Omega}$.

† Two models of random graphs are contiguous if the following holds: (i) for every (relevant) $n$ they define
distributions on the same set of graphs on $n$ vertices, and (ii) whenever a sequence of events has probability $1 - o_n(1)$
in one distribution, it has a probability of $1 - o_n(1)$ in the other distribution as well.

‡ The spectral radius of an operator $M$ is defined as $\sup\{|\lambda| \,:\, \lambda \in \mathrm{Spec}\, M\}$.

§ We stress that we consider undirected edges. Although one should first choose an arbitrary orientation for each
edge in order to construct the random covering, the orientation does not impact the resulting probability space.
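
The model $\mathcal{C}_{n,\Omega}$ can be sampled directly from its description. The sketch below is ours, not the paper's: it assumes NumPy, encodes the base graph as a list of arbitrarily oriented edges `(u, v)` on vertices `0, ..., num_vertices - 1`, and (as an arbitrary indexing choice) stores the vertex $(u, i)$ of the covering at index `u * n + i`.

```python
import numpy as np


def random_covering(base_edges, num_vertices: int, n: int,
                    rng: np.random.Generator) -> np.ndarray:
    """Adjacency matrix of a random n-covering of the base multigraph.

    base_edges: list of (u, v) pairs; loops and multiple edges are allowed.
    """
    size = num_vertices * n
    adj = np.zeros((size, size))
    for u, v in base_edges:
        sigma = rng.permutation(n)           # one uniform permutation per base edge
        for i in range(n):
            a, b = u * n + i, v * n + sigma[i]
            adj[a, b] += 1                   # edge ((u, i), (v, sigma(i)))
            adj[b, a] += 1
    return adj


if __name__ == "__main__":
    rng = np.random.default_rng(7)           # arbitrary seed
    # Base graph with two vertices and d = 3 parallel edges:
    # its n-coverings are bipartite 3-regular graphs on 2n vertices.
    cover = random_covering([(0, 1)] * 3, num_vertices=2, n=100, rng=rng)
    print("degrees:", set(cover.sum(axis=1)))
```

Since the covering map is a local isomorphism, every vertex $(u, i)$ has the same degree as $u$, which gives a quick sanity check on any sample.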

A word about the spectrum of a non-regular graph is due. In the case of $d$-regular graphs
we have considered the spectrum of the adjacency operator. In the general case, it is not a priori
clear which operator best describes in spectral terms the properties of the graph. In this paper we
consider two operators: the adjacency operator $A_\Gamma$ defined as above, and the Markov operator $M_\Gamma$
defined by

$$(M_\Gamma f)(u) = \frac{1}{\deg(u)} \sum_{v \sim u} f(v).$$

(A third possible operator is the Laplacian; see Appendix B.) With a suitable inner product, each
of these operators is self-adjoint and therefore admits a real spectrum (see Appendix B for the
relations of these spectra to expansion properties of $\Gamma$).

For a finite graph $\Omega$ on $m$ vertices, the spectrum of the adjacency matrix $A_\Omega$ is

$$\mathrm{pf}(\Omega) = \lambda_1 \ge \dots \ge \lambda_m \ge -\mathrm{pf}(\Omega),$$

$\mathrm{pf}(\Omega)$ being the Perron-Frobenius eigenvalue of $A_\Omega$. The spectrum of $M_\Omega$ is

$$1 = \mu_1 \ge \dots \ge \mu_m \ge -1,$$

the eigenvalue 1 corresponding to the constant function. Every finite covering $\Gamma$ of $\Omega$ shares the same
Perron-Frobenius eigenvalue, and moreover, inherits the entire spectrum of $\Omega$ (with multiplicity):
Let $\pi : \Gamma \to \Omega$ be the covering map, sending the vertex $(v, i)$ to $v$ and the edge $((u, i), (v, j))$ to
$(u, v)$. Every eigenfunction $f : V(\Omega) \to \mathbb{C}$ of any operator on $\ell^2(V(\Omega))$ as above can be pulled
back to an eigenfunction of $\Gamma$, $f \circ \pi$, with the same eigenvalue. Thus, every eigenvalue of $\Omega$ (with
multiplicity) is trivially an eigenvalue of $\Gamma$ as well. We denote by $\lambda_A(\Gamma)$ the largest absolute value
of a new eigenvalue of $A_\Gamma$, namely the largest one not inherited from $\Omega$. Equivalently, this is the
largest absolute eigenvalue of an eigenfunction of $\Gamma$ which sums to zero on every fiber of $\pi$. In a
similar fashion we define $\lambda_M(\Gamma)$, the largest absolute value of a new eigenvalue of $M_\Gamma$. Note that
in the regular case (i.e. when $\Omega$ is $d$-regular), $A_\Gamma = d \cdot M_\Gamma$, and so $\lambda_A(\Gamma) = d \cdot \lambda_M(\Gamma)$. Moreover,
when $\Omega = B_d$ is the bouquet, $\lambda_A(\Gamma) = \lambda(\Gamma)$.
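
The characterization of new eigenvalues via functions summing to zero on every fiber suggests a direct way to compute $\lambda_A(\Gamma)$ numerically. The following Python sketch is our own illustration (NumPy assumed; the function names and the indexing of $(u, i)$ as `u * n + i` are our conventions). It projects onto the fiber-sum-zero subspace, which is invariant under the symmetric matrix $A_\Gamma$ because its orthogonal complement, the pulled-back functions, is invariant.

```python
import numpy as np


def random_covering(base_edges, num_vertices, n, rng):
    """Adjacency matrix of a random n-covering; vertex (u, i) has index u * n + i."""
    size = num_vertices * n
    adj = np.zeros((size, size))
    for u, v in base_edges:
        sigma = rng.permutation(n)
        for i in range(n):
            a, b = u * n + i, v * n + sigma[i]
            adj[a, b] += 1
            adj[b, a] += 1
    return adj


def new_eigenvalue(adj: np.ndarray, num_vertices: int, n: int) -> float:
    """Largest absolute eigenvalue of adj on functions summing to zero on each fiber."""
    size = num_vertices * n
    # Orthogonal projection onto functions that are constant on every fiber.
    fiber_avg = np.kron(np.eye(num_vertices), np.full((n, n), 1.0 / n))
    proj = np.eye(size) - fiber_avg          # projection onto fiber-sum-zero functions
    restricted = proj @ adj @ proj           # extra zero eigenvalues do not affect the max
    return float(np.max(np.abs(np.linalg.eigvalsh(restricted))))


if __name__ == "__main__":
    rng = np.random.default_rng(11)          # arbitrary seed
    d, n = 4, 300
    bouquet = [(0, 0)] * (d // 2)            # one vertex, d/2 loops
    adj = random_covering(bouquet, 1, n, rng)
    print("lambda_A(Gamma) =", new_eigenvalue(adj, 1, n),
          " rho =", 2 * np.sqrt(d - 1))
```

For the bouquet the only inherited eigenvalue is $d$, so this quantity coincides with $\lambda(\Gamma)$ as defined for $d$-regular graphs.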

As in the regular case, the largest non-trivial eigenvalue is closely related to the spectral radius
of $T$, the universal covering tree of $\Omega$ (which is also the universal covering of every covering $\Gamma$ of $\Omega$).
We denote by $\rho_A(\Omega)$ and $\rho_M(\Omega)$ the spectral radii of $A_T$ and $M_T$, resp. (So when $\Omega$ is $d$-regular,
$\rho_A(\Omega) = d \cdot \rho_M(\Omega) = 2\sqrt{d-1}$.) First, there are parallels of Alon-Boppana's bound in this more
general scenario. The first part of the following proposition is due to Greenberg, while the second
one is due to Burger:

Proposition 1.2. Let $\Gamma$ be an $n$-covering of $\Omega$. Then

(1) $\lambda_A(\Gamma) \ge \rho_A(\Omega) - o_n(1)$ [Gre95, Thm 2.11].

(2) $\lambda_M(\Gamma) \ge \rho_M(\Omega) - o_n(1)$ [Bur87], [GZ99, Prop. 6].

When $\Omega$ is $d$-regular (but not necessarily a bouquet), Greenberg's result was first observed by
Serre [Ser90].

As in the $d$-regular case, the only deterministic upper bounds are trivial: $\lambda_A(\Gamma) \le \mathrm{pf}(\Omega)$ and
$\lambda_M(\Gamma) \le 1$. But there are interesting probabilistic phenomena. The following conjecture is the
natural extension of Alon's conjecture. The adjacency-operator version is due to Friedman [Fri03].
We extend it to the Markov operator $M$ as well:

Conjecture 1.3 (Friedman, [Fri03]). Let $\Omega$ be a finite connected graph. If $\Gamma$ is a random $n$-covering
of $\Omega$, then for every $\varepsilon > 0$,

$$\lambda_A(\Gamma) < \rho_A(\Omega) + \varepsilon$$

asymptotically almost surely, and likewise

$$\lambda_M(\Gamma) < \rho_M(\Omega) + \varepsilon$$

asymptotically almost surely.

Since $\lambda_A(\Gamma)$ and $\lambda_M(\Gamma)$ provide an indication for the quality of expansion of $\Gamma$ (see Appendix
B), Conjecture 1.3 asserts that if the base graph is a good (nearly optimal) expander then with
high probability so is its random covering $\Gamma$.

In the same paper ([Fri03]), Friedman generalizes the method of Broder-Shamir mentioned above
and shows that $\lambda_A(\Gamma) < \mathrm{pf}(\Omega)^{1/2} \rho_A(\Omega)^{1/2} + \varepsilon$ a.a.s. An easy variation on his proof gives
$\lambda_M(\Gamma) < \rho_M(\Omega)^{1/2} + \varepsilon$ a.a.s. In [LP10], Linial and the author improve this to
$\lambda_A(\Gamma) < 3\,\mathrm{pf}(\Omega)^{1/3} \rho_A(\Omega)^{2/3} + \varepsilon$ (and with the same technique one can show
$\lambda_M(\Gamma) < 3\rho_M(\Omega)^{2/3} + \varepsilon$). This is the best known result for the general case prior to the
current work.

Several works studied the special case where the base-graph $\Omega$ is $d$-regular (recall that in this case
$\lambda_A(\Gamma) = d \cdot \lambda_M(\Gamma)$). Lubetzky, Sudakov and Vu [LSV11] find a sophisticated improvement of the
Kahn-Szemerédi approach and prove that a.a.s. $\lambda_A(\Gamma) \le C \cdot \max(\lambda(\Omega), \rho_A(\Omega)) \cdot \log \rho_A(\Omega)$ for some
unspecified constant $C$ (since $\Omega$ here is $d$-regular, $\rho_A(\Omega) = 2\sqrt{d-1}$). An asymptotically better
bound of $430{,}656\sqrt{d}$ is given by Addario-Berry and Griffiths [ABG10], by further refining the
same basic technique (note that this bound becomes meaningful only for $d \ge 430{,}656^2$).

The following theorems differ from Conjecture 1.3 only by a small additive or multiplicative
factor, and are nearly optimal by Proposition 1.2. They pose a substantial improvement upon all
former results, both in the special case of a $d$-regular base-graph $\Omega$ and, to a larger extent, in the
general case of any finite base-graph.

Theorem 1.4. Let $\Omega$ be an arbitrary finite connected graph, and let $\Gamma$ be a random $n$-covering of
$\Omega$. Then for every $\varepsilon > 0$,

$$\lambda_A(\Gamma) < \sqrt{3} \cdot \rho_A(\Omega) + \varepsilon$$

asymptotically almost surely, and similarly

$$\lambda_M(\Gamma) < \sqrt{3} \cdot \rho_M(\Omega) + \varepsilon$$

asymptotically almost surely.

For the special case where $\Omega$ is regular, we obtain the same bound as in the case of the bouquet
(Theorem 1.1):

Theorem 1.5. Let $\Omega$ be a finite connected $d$-regular graph ($d \ge 3$) and let $\Gamma$ be a random $n$-covering
of $\Omega$. Then

$$\lambda_A(\Gamma) < \rho_A(\Omega) + 0.84 = 2\sqrt{d-1} + 0.84$$

asymptotically almost surely.

We stress the following special case concerning random bipartite $d$-regular graphs. It follows
as all bipartite regular graphs cover the graph $\Omega$ consisting of two vertices and $d$ edges connecting
them.

Corollary 1.6. Let $\Gamma$ be a random bipartite $d$-regular graph on $n$ vertices ($d \ge 3$). Then

$$\lambda_A(\Gamma) < 2\sqrt{d-1} + 0.84$$

asymptotically almost surely (as $n \to \infty$)†.

(This means that alongside the two trivial eigenvalues $\pm d$, all other eigenvalues of the bipartite
graph $\Gamma$ are a.a.s. within $[-2\sqrt{d-1} - 0.84,\ 2\sqrt{d-1} + 0.84]$. The result applies also to random
simple bipartite regular graphs: see Appendix A.)

† Again, for small values of $d$ a better bound is reachable; see Remark 6.3.

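To put Theorems 1.1, 1.4 and 1.5 in context, the following table summarizes the results
mentioned above for the different cases in question, with respect to the adjacency operator $A_\Gamma$. As
before, $\Omega$ is the connected base-graph and $\rho = \rho_A(\Omega)$ is the spectral radius of its universal covering
tree. The results are ordered by their asymptotic behavior.

| The base-graph $\Omega$ | Any graph | $d$-regular | $B_d$ = a bouquet of $\frac{d}{2}$ loops |
| $\rho$ | | $\rho = 2\sqrt{d-1}$ | $\rho = 2\sqrt{d-1}$ |
| Deterministic lower bound for $\lambda_A(\Gamma)$ | $\rho - o_n(1)$ [Gre95] | $\rho - o_n(1)$ [Ser90] | $\rho - o_n(1)$ (Alon-Boppana) [Nil91] |
| Conjectured probabilistic upper bound | $\rho + \varepsilon$ [Fri03] | | $\rho + \varepsilon$ [Alo86] |
| Probabilistic upper bounds, ordered by asymptotic strength for growing $\rho$ | $\sqrt{\mathrm{pf}(\Omega)\,\rho} + \varepsilon$ [Fri03] | $\Rightarrow \sqrt{d\rho} + \varepsilon$ | $\sqrt{d\rho} + \varepsilon$ [BS87] |
| | $3 \cdot \mathrm{pf}(\Omega)^{1/3} \rho^{2/3} + \varepsilon$ [LP10] | $\Rightarrow 3 \cdot d^{1/3} \rho^{2/3} + \varepsilon$ | |
| | | $C \cdot \max(\lambda(\Omega), \rho) \log \rho$ [LSV11] | |
| | | $265{,}000 \cdot \rho$ [ABG10] | $6{,}200 \cdot \rho$ [FKS89, DJPP11] |
| | $\sqrt{3} \cdot \rho + \varepsilon$ (Thm 1.4) | | |
| | | | $\rho + 2\log d + c$ [Fri91] |
| | | $\rho + 0.84$ (Thm 1.5) | $\rho + 0.84$ (Thm 1.1) |
| | | | $\rho + \varepsilon$ [Fri08] |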



Finally, let us stress that alongside the different models for random $d$-regular graphs, random
coverings of a fixed, good expander are probably the most natural other source for random, good
expanders ("good" expanders are sparse graphs with high quality of expansion). Other known
models for random graphs do not necessarily have this property. For example, the Erdős-Rényi
model $G(n, p)$ fails to produce good expander graphs: when $p$ is small ($O(\frac{1}{n})$) the generic graph
is not an expander (due, e.g., to lack of connectivity), whereas for larger values of $p$, the average
degree grows unboundedly.

2 Overview of the Proof 

In this section we present the outline of the proof of Theorems 1.1, 1.4 and 1.5 (only the spectrum
of the adjacency operator is considered in this section). We assume the reader has some familiarity
with free groups, although we recall the basic definitions and classical relevant results throughout
the text. For a good exposition of free groups and combinatorial group theory we refer the reader
to [Bog08].
Step I: The trace method

Let $\Omega$ be a fixed base graph with $k$ edges and $\Gamma$ a random $n$-covering in the model $\mathcal{C}_{n,\Omega}$. In the
spirit of the trace method, the spectrum of $\Gamma$ is analyzed by counting closed paths. More concretely,
denote by $CP_t(\Gamma)$ the set of closed paths of edge-length $t$ in $\Gamma$. If $\mathrm{Spec}(A_\Gamma)$ denotes the multiset
of eigenvalues of $A_\Gamma$, then for every $t \in \mathbb{N}$,

$$\sum_{\mu \in \mathrm{Spec}(A_\Gamma)} \mu^t = \mathrm{tr}(A_\Gamma^t) = |CP_t(\Gamma)|.$$

Orient each of the $k$ edges of $\Omega$ arbitrarily, label them by $x_1, \dots, x_k$ and let $X = \{x_1, \dots, x_k\}$. Let
$\sigma_1, \dots, \sigma_k \in S_n$ denote the random permutations by which $\Gamma$ is defined: for each edge $x_j = (u, v)$
of $\Omega$ and each $i \in [n]$, $\Gamma$ has an edge $((u, i), (v, \sigma_j(i)))$. Note that every closed path in $\Gamma$ projects
to a closed path in $\Omega$. Thus, instead of counting directly closed paths in $\Gamma$, one can count, for every
closed path in $\Omega$, the number of closed paths in $\Gamma$ projecting onto it.

Let $w = x_{j_1}^{\varepsilon_1} \cdots x_{j_t}^{\varepsilon_t} \in CP_t(\Omega) \subseteq (X \cup X^{-1})^t$ be a closed path in the base graph $\Omega$, beginning
(and terminating) at some vertex $v \in V(\Omega)$. (Here $\varepsilon_i = \pm 1$ and $x_j^{-1}$ means the path traverses
the edge $x_j$ in the opposite orientation.) For every $i \in [n]$ there is a unique lift of $w$ to some path
in $\Gamma$, not necessarily closed, which begins at the vertex $(v, i)$. This lifted path terminates at the
vertex $(v, j)$, where $j$ is obtained as follows: let $w(\sigma_1, \dots, \sigma_k)$ denote the permutation obtained
by composing $\sigma_1, \dots, \sigma_k$ according to $w$, namely, $w(\sigma_1, \dots, \sigma_k) = \sigma_{j_1}^{\varepsilon_1} \cdots \sigma_{j_t}^{\varepsilon_t} \in S_n$. Then $j$ is the
image of $i$ under this permutation: $j = w(\sigma_1, \dots, \sigma_k)(i) = \sigma_{j_1}^{\varepsilon_1} \cdots \sigma_{j_t}^{\varepsilon_t}(i)$†. Thus, the $i$-th lift of $w$
is a closed path if and only if $i$ is a fixed point of the permutation $w(\sigma_1, \dots, \sigma_k)$, and the number
of closed paths in $\Gamma$ projecting onto $w$ is equal to the number of fixed points of $w(\sigma_1, \dots, \sigma_k)$. We
denote this number by $F_{w,n} = F_{w,n}(\sigma_1, \dots, \sigma_k)$.
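
The lifting procedure can be traced in code. The sketch below is ours (plain Python, no external libraries): a word is encoded, by our own convention, as a list of pairs `(j, eps)` with `j` a 0-based generator index and `eps` equal to `+1` or `-1`, and letters are applied from left to right, matching the composition convention used in this paper.

```python
def invert(perm):
    """Inverse of a permutation given as a list: perm[i] is the image of i."""
    inv = [0] * len(perm)
    for i, image in enumerate(perm):
        inv[image] = i
    return inv


def lift_endpoint(word, perms, i):
    """Index j such that the lift of `word` starting in sheet i ends in sheet j."""
    inverses = [invert(p) for p in perms]
    for j, eps in word:                      # apply the letters from left to right
        i = perms[j][i] if eps == 1 else inverses[j][i]
    return i


def fixed_points(word, perms):
    """F_{w,n}: the number of fixed points of w(sigma_1, ..., sigma_k)."""
    n = len(perms[0])
    return sum(1 for i in range(n) if lift_endpoint(word, perms, i) == i)


if __name__ == "__main__":
    sigma1 = [1, 2, 0]                       # the 3-cycle 0 -> 1 -> 2 -> 0
    sigma2 = [1, 0, 2]                       # the transposition swapping 0 and 1
    word = [(0, 1), (1, 1)]                  # w = x_1 x_2
    print(fixed_points(word, [sigma1, sigma2]))
```

Each fixed point counted by `fixed_points` corresponds to exactly one closed lift of the path $w$.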

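Claim 2.1. For every even $t \in \mathbb{N}$,

$$E\left[\lambda_A(\Gamma)^t\right] \le \sum_{w \in CP_t(\Omega)} \left[E[F_{w,n}] - 1\right]. \qquad (2.1)$$

(The expectation on the l.h.s. is over $\mathcal{C}_{n,\Omega}$, which amounts to the i.i.d. uniform permutations
$\sigma_1, \dots, \sigma_k \in S_n$. The expectation on the r.h.s. is over the same $k$-tuple of permutations.)

Proof. Since $t$ is even,

$$\lambda_A(\Gamma)^t = \Big[\max_{\mu \in \mathrm{Spec}(A_\Gamma) \setminus \mathrm{Spec}(A_\Omega)} |\mu|\Big]^t \le \sum_{\mu \in \mathrm{Spec}(A_\Gamma) \setminus \mathrm{Spec}(A_\Omega)} \mu^t = \sum_{\mu \in \mathrm{Spec}(A_\Gamma)} \mu^t - \sum_{\mu \in \mathrm{Spec}(A_\Omega)} \mu^t$$

$$= |CP_t(\Gamma)| - |CP_t(\Omega)| = \sum_{w \in CP_t(\Omega)} \left[F_{w,n}(\sigma_1, \dots, \sigma_k) - 1\right].$$

(Recall that we regard the spectrum of an operator as a multiset.) The claim is established by
taking expectations. □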

We shall assume henceforth that $t$ is an even integer. Note that in the special case where $\Omega = B_d$
is a bouquet of $\frac{d}{2}$ loops, $\mathrm{Spec}(A_\Omega) = \{d\}$, and $CP_t(\Omega) = (X \cup X^{-1})^t$, i.e. it consists of all words
of length $t$ in the letters $X \cup X^{-1}$ (not necessarily reduced), so that $|CP_t(B_d)| = d^t$.
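
For the bouquet, the identity $|CP_t(\Gamma)| = \sum_{w} F_{w,n}$, with $w$ ranging over all $d^t$ words of length $t$, can be verified exhaustively for small parameters. The following Python sketch is our own check, not part of the paper; it assumes NumPy and uses the row convention that the permutation matrix of $\sigma$ has a 1 in position $(i, \sigma(i))$, so that matrix products compose permutations from left to right.

```python
import itertools

import numpy as np


def perm_matrix(sigma) -> np.ndarray:
    """Permutation matrix with a 1 in position (i, sigma(i))."""
    n = len(sigma)
    mat = np.zeros((n, n), dtype=np.int64)
    mat[np.arange(n), sigma] = 1
    return mat


def closed_paths_via_trace(perms, t: int) -> int:
    """|CP_t(Gamma)| = tr(A^t), where A is the sum of P_sigma + P_sigma^T over all sigma."""
    adj = sum(perm_matrix(s) + perm_matrix(s).T for s in perms)
    return int(np.trace(np.linalg.matrix_power(adj, t)))


def closed_paths_via_words(perms, t: int) -> int:
    """Sum of F_{w,n} over all words w of length t in the letters x_j and x_j^{-1}."""
    mats = [perm_matrix(s) for s in perms]
    letters = mats + [m.T for m in mats]     # the transpose represents the inverse
    n = len(perms[0])
    total = 0
    for word in itertools.product(letters, repeat=t):
        prod = np.eye(n, dtype=np.int64)
        for mat in word:
            prod = prod @ mat
        total += int(np.trace(prod))         # number of fixed points of w(sigma)
    return total


if __name__ == "__main__":
    rng = np.random.default_rng(3)           # arbitrary seed
    perms = [rng.permutation(6) for _ in range(2)]   # k = 2, so d = 4
    t = 4
    print(closed_paths_via_trace(perms, t), closed_paths_via_words(perms, t))
```

Subtracting $d^t = |CP_t(B_d)|$ from either count gives the sum of $F_{w,n} - 1$ over all words, which is the quantity whose expectation appears on the right-hand side of (2.1).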



† For convenience, we use in this paper the non-standard convention that permutations are composed from left
to right.
Step II: The expected number of fixed points in $w(\sigma_1, \dots, \sigma_k)$

The next stage in the proof of the main results is an analysis of $E[F_{w,n}]$. This is where the results
from [PP12] come to bear. Let $F_k = F(X)$ be the free group on $k$ generators. Every word
$w \in CP_t(\Omega) \subseteq (X \cup X^{-1})^t$ corresponds to an element of $F_k$ (by abuse of notation we let $w$ denote
an element of $(X \cup X^{-1})^t$ and of $F_k = F(X)$ at the same time; it is important to stress that
reduction† of $w$ does not affect the associated permutation $w(\sigma_1, \dots, \sigma_k)$). The main theorem in
[PP12] estimates the expected number of fixed points of the permutation $w(\sigma_1, \dots, \sigma_k) \in S_n$, where
$\sigma_1, \dots, \sigma_k \in S_n$ are random permutations chosen independently with uniform distribution. This
theorem shows that this expectation is related to an algebraic invariant of $w$ called its primitivity
rank, which we now describe.

A word $w \in F_k$ is primitive if it belongs to a basis‡ of $F_k$. For a given $w$, one can also ask
whether $w$ is primitive as an element of different subgroups of $F_k$ (which are free as well by a
classical theorem of Nielsen and Schreier [Bog08, Chap. 2.8]). If $w$ is primitive in $F_k$, it is also
primitive in every subgroup $J \le F_k$ containing it (e.g. [Pud11, Claim 2.5]). However, if $w$ is not primitive in
$F_k$, it is sometimes primitive and sometimes not so in subgroups containing it. Theoretically, one
can go over all subgroups of $F_k$ containing $w$, ordered by their rank§, and look for the first time at
which $w$ is not primitive. First introduced in [Pud11], the primitivity rank of $w \in F_k$ captures this
notion:

Definition 2.2. The primitivity rank of $w \in F_k$, denoted $\pi(w)$, is

$$\pi(w) = \min\left\{ \mathrm{rk}(J) \;\middle|\; w \in J \le F_k \text{ s.t. } w \text{ is not primitive in } J \right\}.$$

If no such $J$ exists, i.e. if $w$ is primitive in $F_k$, then $\pi(w) = \infty$.

A subgroup $J$ for which the minimum is obtained is called $w$-critical, and the set of $w$-critical
subgroups is denoted $\mathrm{Crit}(w)$.

For instance, $\pi(w) = 1$ if and only if $w$ is a proper power ($w = v^d$ for some $v \in F_k$ and
$d \ge 2$). By Corollary 4.2 and Lemma 6.8 in [Pud11], in $F_k$ the set of all possible primitivity ranks
is $\{0, 1, 2, \dots, k\} \cup \{\infty\}$ (the only word $w$ with $\pi(w) = 0$ is $w = 1$). Moreover, $\pi(w) = \infty$ iff $w$ is
primitive. The same paper also gives a method for computing $\pi(w)$.

The following theorem estimates $E[F_{w,n}]$, the expected number of fixed points of $w(\sigma_1, \dots, \sigma_k)$,
where $\sigma_1, \dots, \sigma_k \in S_n$ are chosen independently at random with uniform distribution:

Theorem 2.3. [PP12, Thm 1.7] For every $w \in F_k$, the expected number of fixed points in
$w(\sigma_1, \dots, \sigma_k)$ is

$$E[F_{w,n}] = 1 + \frac{|\mathrm{Crit}(w)|}{n^{\pi(w)-1}} + O\left(\frac{1}{n^{\pi(w)}}\right).$$

In particular, it is also shown that $\mathrm{Crit}(w)$ is always finite. The three leftmost columns in Table
1 summarize the connection implied by Theorem 2.3 between the primitivity rank of $w$ and the
average number of fixed points in the random permutation $w(\sigma_1, \dots, \sigma_k)$.
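
For very small $n$ the expectation $E[F_{w,n}]$ can be computed exactly by enumerating all $k$-tuples of permutations, which gives a feel for how the primitivity rank governs it. The sketch below is ours (standard library only); words are encoded, by our convention, as lists of `(j, eps)` pairs with 0-based generator index `j` and exponent `eps` equal to `+1` or `-1`.

```python
from fractions import Fraction
from itertools import permutations, product


def word_fixed_points(word, perms):
    """Number of fixed points of w(sigma_1, ..., sigma_k), composing left to right."""
    n = len(perms[0])
    # argsort of a permutation is its inverse
    inverses = [tuple(sorted(range(n), key=p.__getitem__)) for p in perms]
    count = 0
    for start in range(n):
        i = start
        for j, eps in word:
            i = perms[j][i] if eps == 1 else inverses[j][i]
        count += (i == start)
    return count


def expected_fixed_points(word, k: int, n: int) -> Fraction:
    """Exact E[F_{w,n}] over independent uniform sigma_1, ..., sigma_k in S_n."""
    sym = list(permutations(range(n)))
    total = sum(word_fixed_points(word, ps) for ps in product(sym, repeat=k))
    return Fraction(total, len(sym) ** k)


if __name__ == "__main__":
    primitive = [(0, 1)]                             # w = x_1
    square = [(0, 1), (0, 1)]                        # w = x_1^2, a proper power
    commutator = [(0, 1), (1, 1), (0, -1), (1, -1)]  # w = [x_1, x_2]
    for name, w in (("x1", primitive), ("x1^2", square), ("[x1,x2]", commutator)):
        print(name, [expected_fixed_points(w, 2, n) for n in (2, 3, 4)])
```

For the primitive word $x_1$ the expectation is exactly 1; for $x_1^2$ it equals 2 for every $n \ge 2$; and for the commutator a standard character computation gives $1 + \frac{1}{n-1}$, which matches the shape $1 + \frac{|\mathrm{Crit}(w)|}{n} + O(\frac{1}{n^2})$ expected of a word of primitivity rank 2 with a single critical subgroup.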

With Theorem 2.3 at hand, we can use the primitivity rank to split the summation in (2.1). We
shall use the notation $CP_t^m(\Omega) = \{w \in CP_t(\Omega) \mid \pi(w) = m\}$ for the subsets we obtain by splitting
$CP_t(\Omega)$ according to primitivity ranks:

$$E\left[\lambda_A(\Gamma)^t\right] \le \sum_{w \in CP_t(\Omega)} \left(E[F_{w,n}] - 1\right) = \sum_{m=0}^{k} \sum_{w \in CP_t^m(\Omega)} \left( \frac{|\mathrm{Crit}(w)|}{n^{m-1}} + O\left(\frac{1}{n^m}\right) \right) \qquad (2.2)$$

(note that for primitive words, i.e. words with $\pi(w) = \infty$, the expected number of fixed points is
exactly 1, so their contribution to the summation vanishes.)

† By reduction of a word we mean the (repeated) deletion of subwords of the form $x_i x_i^{-1}$ or $x_i^{-1} x_i$ for some
$x_i \in X$.

‡ A basis of a free group is a free generating set. Namely, this is a generating set such that every element of the
group can be expressed in a unique way as a reduced word in the elements of the set and their inverses. For $F_k$ this
is equivalent to a generating set of size $k$ [Bog08, Chap. 2.29].

§ The rank of a free group $F$, denoted $\mathrm{rk}(F)$, is the size of (every) basis of $F$.

Step III: A uniform bound for $E[F_{w,n}]$

The error term $O\left(\frac{1}{n^{\pi(w)}}\right)$ in the estimate of $E[F_{w,n}]$ depends on $w$. For a given $w \in CP_t(\Omega)$,
this error term becomes negligible as $n \to \infty$. However, in order to bound the r.h.s. of (2.2), one
needs a uniform bound for all closed paths of length $t$ in $\Omega$. Namely, for every $m$ one needs to
control the $O(\cdot)$ term for all $w \in CP_t(\Omega)$ with $\pi(w) = m$ simultaneously. The third stage is
therefore the following proposition:

Proposition (Follows from Prop. 5.1 and Claim 5.2). Let $t = t(n)$ and $w \in (X \cup X^{-1})^t$. If
$t^{2k+2} = o(n)$, then

$$E[F_{w,n}] \le 1 + \frac{|\mathrm{Crit}(w)|}{n^{\pi(w)-1}} \left(1 + o_n(1)\right),$$

where the $o_n(1)$ does not depend on $w$.

Hence, as long as we keep $t^{2k+2} = o(n)$, we obtain:

$$E\left[\lambda_A(\Gamma)^t\right] \le \left(1 + o_n(1)\right) \sum_{m=0}^{k} \frac{1}{n^{m-1}} \sum_{w \in CP_t^m(\Omega)} |\mathrm{Crit}(w)| \qquad (2.3)$$

Step IV: Counting words and critical subgroups 

The fourth step of the proof consists of estimating the exponential growth rate (as $t \to \infty$) of
the summation $\sum_{w \in CP_t^m(\Omega)} |\mathrm{Crit}(w)|$ for every $m \in \{0, 1, \dots, k\}$. For $m = 0$, the only reduced
word with $\pi(w) = 0$ is $w = 1$, and its sole critical subgroup is the trivial subgroup $\{1\}$, so
$\sum_{w \in CP_t^0(\Omega)} |\mathrm{Crit}(w)| = |CP_t^0(\Omega)|$. Moreover, words reducing to 1 are precisely the completely
back-tracking closed paths, i.e. the paths lifting to closed paths in the covering tree. It follows that
the exponential growth rate of $|CP_t^0(\Omega)|$ is exactly $\rho = \rho_A(\Omega)$, the spectral radius of the covering
tree (see Claim B~TU|). For larger $m$ we obtain the following upper bound:
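
In the bouquet case the count of words reducing to 1 is easy to compute, which makes the role of the spectral radius visible. The sketch below is our own illustration (standard library only; function names are ours): words of length $t$ in $X \cup X^{-1}$ with $|X| = k$ are walks from the root of the $2k$-regular tree, and a word reduces to 1 exactly when the walk is closed, so it suffices to track the distance from the root.

```python
import math


def words_reducing_to_identity(k: int, t: int) -> int:
    """Number of words of length t in X and X^{-1} (|X| = k) that reduce to 1."""
    d = 2 * k
    # walks[r] = number of walks of the current length ending at distance r from the root
    walks = [0] * (t + 2)
    walks[0] = 1
    for _ in range(t):
        nxt = [0] * (t + 2)
        for r in range(t + 1):
            count = walks[r]
            if count == 0:
                continue
            if r == 0:
                nxt[1] += count * d          # every step from the root moves away
            else:
                nxt[r + 1] += count * (d - 1)
                nxt[r - 1] += count          # exactly one step leads back toward the root
        walks = nxt
    return walks[0]


def growth_rate(k: int, t: int) -> float:
    """The t-th root of the count above (t even, so that the count is positive)."""
    return math.exp(math.log(words_reducing_to_identity(k, t)) / t)


if __name__ == "__main__":
    k = 2
    for t in (10, 100, 1000):
        print(t, growth_rate(k, t), 2 * math.sqrt(2 * k - 1))
```

The $t$-th roots increase toward $2\sqrt{2k-1}$ from below, slowly because of a polynomial correction factor, in agreement with Kesten's value for the spectral radius of the $2k$-regular tree.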

Theorem (Theorem 4.8). Let $\Omega$ be a finite, connected graph with $k \ge 2$ edges, and let
$m \in \{1, \dots, k\}$. Then

$$\limsup_{t \to \infty} \Big[ \sum_{w \in CP_t^m(\Omega)} |\mathrm{Crit}(w)| \Big]^{1/t} \le (2m - 1) \cdot \rho.$$

This upper bound is not tight in general. However, in the special case where $\Omega$ is $d$-regular, we
give better bounds:

Theorem (Follows from Corollaries 4.5 and 4.14 and Theorem 8.5). Let $\Omega$ be a finite, connected
$d$-regular graph ($d \ge 3$) with $k$ edges, and let $m \in \{0, 1, \dots, k\}$. Then

$$\limsup_{t \to \infty} \Big[ \sum_{w \in CP_t^m(\Omega)} |\mathrm{Crit}(w)| \Big]^{1/t} \le \begin{cases} 2\sqrt{2k-1} & 2m - 1 \le \sqrt{2k-1} \\ \frac{2k-1}{2m-1} + 2m - 1 & 2m - 1 \ge \sqrt{2k-1} \end{cases}$$

Moreover, for $\Omega = B_d$ the bouquet, there is equality:

$$\limsup_{t \to \infty} \Big[ \sum_{w \in CP_t^m(B_d)} |\mathrm{Crit}(w)| \Big]^{1/t} = \begin{cases} 2\sqrt{2k-1} & 2m - 1 \le \sqrt{2k-1} \\ \frac{2k-1}{2m-1} + 2m - 1 & 2m - 1 \ge \sqrt{2k-1} \end{cases}$$

Remark 2.4. In fact, in the case of the bouquet, the growth rates in the statement of the last
theorem remain the same if we assume every word has only a single critical subgroup. That is, the
r.h.s. gives also the growth rate of the number of words in $(X \cup X^{-1})^t$ with primitivity rank $m$;
see Theorem 8.5.

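Table 1 summarizes the content of Theorems 2.3, 4.8 and 8.5†.

| $\pi(w)$ | Description of $w$ | $E[F_{w,n}]$ | Growth rate for the bouquet $B_d$ | Bound for growth rate for general $\Omega$ |
| 0 | $w = 1$ | $n$ | $2\sqrt{2k-1}$ | $\rho$ |
| 1 | a power | $1 + \lvert\mathrm{Crit}(w)\rvert + O(\frac{1}{n})$ | $2\sqrt{2k-1}$ | $\rho$ |
| 2 | E.g. $[x_1, x_2]$, $x_1^2 x_2^2$ | $1 + \frac{\lvert\mathrm{Crit}(w)\rvert}{n} + O(\frac{1}{n^2})$ | $2\sqrt{2k-1}$ | $3\rho$ |
| 3 | E.g. $x_1^2 x_2^2 x_3^2$ | $1 + \frac{\lvert\mathrm{Crit}(w)\rvert}{n^2} + O(\frac{1}{n^3})$ | $2\sqrt{2k-1}$ | $5\rho$ |
| ... | | | | |
| $\lfloor \frac{\sqrt{2k-1}+1}{2} \rfloor$ | | $1 + \frac{\lvert\mathrm{Crit}(w)\rvert}{n^{\pi(w)-1}} + O(\frac{1}{n^{\pi(w)}})$ | $2\sqrt{2k-1}$ | $(2\pi(w) - 1)\rho$ |
| $\lfloor \frac{\sqrt{2k-1}+1}{2} \rfloor + 1$ | | $1 + \frac{\lvert\mathrm{Crit}(w)\rvert}{n^{\pi(w)-1}} + O(\frac{1}{n^{\pi(w)}})$ | $2\pi(w) - 1 + \frac{2k-1}{2\pi(w)-1}$ | $(2\pi(w) - 1)\rho$ |
| ... | | | | |
| $k$ | E.g. $x_1^2 \cdots x_k^2$ | $1 + \frac{\lvert\mathrm{Crit}(w)\rvert}{n^{k-1}} + O(\frac{1}{n^k})$ | $2k$ | $(2k - 1)\rho$ |
| $\infty$ | primitive | 1 | $2k - 2 + \frac{2}{2k-3}$ | |

Table 1: Primitivity rank, the average number of fixed points, the exponential growth rate of
$\sum_{w \in CP_t^m(B_d)} |\mathrm{Crit}(w)|$, and bounds on the exponential growth rate of $\sum_{w \in CP_t^m(\Omega)} |\mathrm{Crit}(w)|$.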



Whereas in the special case of the bouquet we count words in $F_k$ of a given length and a given
primitivity rank, the case of a general graph concerns the equivalent question for words which in
addition belong to some fixed subgroups of $F_k$. (There is one such subgroup for each vertex $v$ of $\Omega$:
the one consisting of the words which correspond to closed paths at $v$.) The fact that the bounds
in Corollary 4.5 are better than those in Theorem 4.8 explains the gap between Theorems 1.1 and
1.5, which are tight up to a small additive constant, and Theorem 1.4, which is tight up to a small
multiplicative factor (assuming, of course, that Conjecture 1.3 is true).

† The number $2k - 2 + \frac{2}{2k-3}$ in the last row of the table is the exponential growth rate of the set of primitives
in $F_k$, namely of $|CP_t^\infty(B_d)|$. (Primitive words have no critical subgroups.) This result is not necessary for the
current work, and is established in a separate paper [PW13]. We use it here only to show that our bounds for
$\sum_{w \in CP_t^m(B_d)} |\mathrm{Crit}(w)|$ are tight; see Section 8.
Step V: Some analysis

The final step is fairly simple and technical: it consists of analyzing the upper bounds we obtain
from (2.3) together with Theorem 4.8 and Corollary 4.5. We seek the value of $t$ (as a function of
$n$) which yields the best bounds.

The paper is arranged as follows. Section 3 provides some basic facts about the concepts of
core graphs and algebraic extensions which are used throughout this paper. In Section 4 we bound
the number of words and critical subgroups and establish the fourth step of the proof (first for
the special case of the bouquet, in Section 4.1, then for the general case in Section 4.2, and finally
for the case where $\Omega$ is an arbitrary $d$-regular graph in Section 4.3). The third step of the proof, where
the error term from Theorem 2.3 is dealt with, is carried out in Section 5, where we have to recall some
more details from [PP12]. Section 6 completes the proof of Theorems 1.1 and 1.5 and addresses the
source of the gap between Theorem 1.1 and Friedman's result. In Section 7 we complete the proof
of Theorem 1.4. We end with results on the accurate exponential growth rate of words with a given
primitivity rank in $F_k$ (Section 8), and then list a few open questions. The appendices provide
some background on the relation between different models of random $d$-regular graphs and between
different models of random coverings (Appendix A), and on the theory of spectral expansion of
non-regular graphs (Appendix B).



3 Preliminaries: Core Graphs and Algebraic Extensions 

This section describes some notions and ideas which are used throughout the current paper. 
3.1 Algebraic Extensions 

Let $H \le J$ be subgroups of $\mathbf{F}_k$. We say that $J$ is an algebraic extension of $H$, and denote
$H \le_{\mathrm{alg}} J$, if there is no intermediate subgroup $H \le L \lneq J$ which is a proper free factor† of $J$. The
name originated in [KM02], but the notion goes back at least to [Tak51], and was formulated
independently by several authors. It is central in the understanding of the lattice of subgroups
of $\mathbf{F}_k$. For example, it can be shown that every extension $H \le J$ of free groups admits a unique
intermediate subgroup $H \le_{\mathrm{alg}} M \le_{\mathrm{ff}} J$ (where $\le_{\mathrm{ff}}$ denotes a free factor). Moreover, if $H \le \mathbf{F}_k$
is a finitely generated subgroup, it has only finitely many algebraic extensions in $\mathbf{F}_k$. Thus, every
group containing $H$ is a free extension of one of the algebraic extensions of $H$, which is a well-known
theorem of Takahasi [Tak51]. For more information we refer the interested reader to [KM02, PP12]
and especially to [MVW07].

The importance of algebraic extensions in the current paper stems from the following easy 
observation: 

Claim 3.1. [Pud11, Section 4] Every $w$-critical subgroup is an algebraic extension of $\langle w \rangle$.

More precisely, $\mathrm{Crit}(w)$ consists precisely of the algebraic extensions of $\langle w \rangle$ of minimal rank besides
$\langle w \rangle$ itself‡.

To see the claim, assume that $H$ is a $w$-critical subgroup of $\mathbf{F}_k$. Obviously, $\langle w \rangle \lneq H$. If $H$ is not
an algebraic extension of $\langle w \rangle$, then there is a proper intermediate free factor $\langle w \rangle \le L \lneq_{\mathrm{ff}} H$. Since
$w$ is not primitive in $H$, it is also not primitive in $L$ (e.g. [Pud11, Claim 2.5]), but $\mathrm{rk}(L) < \mathrm{rk}(H)$,
which is a contradiction. Below, we use properties of $w$-critical subgroups which are actually shared
by all proper algebraic extensions of $\langle w \rangle$.

†If $H \le J$ are free groups then $H$ is said to be a free factor of $J$ if a (every) basis of $H$ can be extended to a
basis of $J$.

‡Unless $w = 1$, in which case $\mathrm{Crit}(w) = \{\{1\}\} = \{\langle w \rangle\}$.






3.2 Core Graphs 



Fix a basis $X = \{x_1, \ldots, x_k\}$ of $\mathbf{F}_k$. Associated with every subgroup $H \le \mathbf{F}_k$ is a directed, pointed
graph whose edges are labeled by $X$. This graph is called the core graph associated with $H$ and is
denoted by $\Gamma_X(H)$. We illustrate the notion in Figure 3.1.

To understand how $\Gamma_X(H)$ is constructed, recall first the notion of the Schreier (right) coset
graph of $H$ with respect to the basis $X$, denoted by $\overline{\Gamma}_X(H)$. This is a directed, pointed and
edge-labeled graph. Its vertex set is the set of all right cosets of $H$ in $\mathbf{F}_k$, where the basepoint corresponds
to the trivial coset $H$. For every coset $Hw$ and every basis element $x_j$ there is a directed $j$-edge
(short for $x_j$-edge) going from the vertex $Hw$ to the vertex $Hwx_j$.§

The core graph $\Gamma_X(H)$ is obtained from $\overline{\Gamma}_X(H)$ by omitting all the vertices and edges of
$\overline{\Gamma}_X(H)$ which are not traced by any reduced (i.e., non-backtracking) path that starts and ends at
the basepoint. Stated informally, we trim all "hanging trees" from $\overline{\Gamma}_X(H)$. To illustrate, Figure 3.1
shows the graphs $\overline{\Gamma}_X(H)$ and $\Gamma_X(H)$ for $H = \langle x_1 x_2 x_1^{-3},\ x_1^2 x_2 x_1^{-2} \rangle \le \mathbf{F}_2$.




Figure 3.1: $\overline{\Gamma}_X(H)$ and $\Gamma_X(H)$ for $H = \langle x_1 x_2 x_1^{-3},\ x_1^2 x_2 x_1^{-2} \rangle \le \mathbf{F}_2$. The Schreier coset graph
$\overline{\Gamma}_X(H)$ is the infinite graph on the left (the dotted lines represent infinite 4-regular trees). The
basepoint "$\otimes$" corresponds to the trivial coset $H$, the vertex below it corresponds to the coset $Hx_1$,
the one further down corresponds to $Hx_1^2 = Hx_1 x_2 x_1^{-1}$, etc. The core graph $\Gamma_X(H)$ is the finite
graph on the right, which is obtained from $\overline{\Gamma}_X(H)$ by omitting all vertices and edges that are not
traced by reduced closed paths around the basepoint.

If $\Gamma$ is a directed pointed graph labeled by some set $X$, paths in $\Gamma$ correspond to words in $F(X)$
(the free group generated by $X$). For instance, the path (from left to right)

\[ \bullet \xrightarrow{x_2} \bullet \xrightarrow{x_2} \bullet \xrightarrow{x_1} \bullet \xleftarrow{x_2} \bullet \xrightarrow{x_3} \bullet \xrightarrow{x_2} \bullet \xleftarrow{x_1} \bullet \]

corresponds to the word $x_2^2 x_1 x_2^{-1} x_3 x_2 x_1^{-1}$. The set of all words obtained from closed paths around
the basepoint in $\Gamma$ is a subgroup of $F(X)$ which we call the labeled fundamental group of $\Gamma$, and
denote by $\pi_1^X(\Gamma)$. Note that $\pi_1^X(\Gamma)$ need not be isomorphic to $\pi_1(\Gamma)$, the standard fundamental
group of $\Gamma$ viewed as a topological space; for example, take $\Gamma$ = [inline figure: a small pointed graph with edges labeled $x_1$].

§Alternatively, $\overline{\Gamma}_X(H)$ is the quotient $H \backslash T$, where $T$ is the Cayley graph of $\mathbf{F}_k$ with respect to the basis $X$, and
$\mathbf{F}_k$ (and thus also $H$) acts on this graph from the left. Moreover, this is the covering space of $\overline{\Gamma}_X(\mathbf{F}_k) = \Gamma_X(\mathbf{F}_k)$,
the bouquet of $k$ loops, corresponding to $H$, via the correspondence between pointed covering spaces of a space $Y$
and subgroups of its fundamental group $\pi_1(Y)$.

However, it is not hard to show that when $\Gamma$ is a core graph, then $\pi_1^X(\Gamma)$ is isomorphic to $\pi_1(\Gamma)$
(e.g. [KM02]). In this case the labeling gives a canonical identification of $\pi_1(\Gamma)$ as a subgroup of
$F(X)$. It is an easy observation that

\[ \pi_1^X\left(\overline{\Gamma}_X(H)\right) = \pi_1^X\left(\Gamma_X(H)\right) = H \tag{3.1} \]

This gives a one-to-one correspondence between subgroups of $F(X) = \mathbf{F}_k$ and core graphs labeled
by $X$. Namely, $\pi_1^X$ and $\Gamma_X$ are the inverses of each other in a bijection (Galois correspondence)

\[ \left\{ \text{Subgroups of } F(X) \right\} \;\underset{\pi_1^X}{\overset{\Gamma_X}{\rightleftarrows}}\; \left\{ \text{Core graphs labeled by } X \right\} \tag{3.2} \]

Core graphs were introduced by Stallings [Sta83]. Our definition is slightly different, and closer to
the one in [KM02, MVW07], in that we allow the basepoint to be of degree one, and in that our
graphs are directed and edge-labeled.

We now list some basic properties of core graphs which are used in the sequel of this paper
(proofs can be found in [Sta83, KM02, MVW07, Pud11]).

Claim 3.2. Let $H$ be a subgroup of $\mathbf{F}_k$ with an associated core graph $\Gamma = \Gamma_X(H)$.

(1) $\mathrm{rk}(H) < \infty \iff \Gamma$ is finite.

(2) $\mathrm{rk}(H) = |E(\Gamma)| - |V(\Gamma)| + 1$ (for a f.g. subgroup $H$).

(3) The correspondence (3.2) restricts to a correspondence between finitely generated subgroups of
$\mathbf{F}_k$ and finite core graphs.
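The construction of a finite core graph and the rank formula of Claim 3.2(2) can be illustrated with a short Python sketch. It assumes the standard Stallings folding procedure (which this section cites but does not spell out): start from a wedge of labeled cycles, one per generator, and repeatedly identify vertices reached by equally labeled edges from a common vertex. The encoding of words as integer lists and the function names `core_graph` and `rank` are illustrative choices, not notation from the paper, and the generators are assumed to be freely reduced.

```python
def core_graph(generators):
    """Fold a wedge of labeled cycles into a core graph (Stallings folding).

    Each generator is a freely reduced word given as a list of nonzero
    integers: +i stands for x_i and -i for x_i^{-1}. Vertex 0 is the basepoint.
    Returns a set of edges (origin, i, terminus), each labeled by x_i.
    """
    edges, fresh = set(), 1
    for word in generators:
        cur = 0
        for pos, letter in enumerate(word):
            if pos == len(word) - 1:
                nxt = 0                      # the last letter returns to the basepoint
            else:
                nxt, fresh = fresh, fresh + 1
            # a negative letter traverses an x_i-edge against its direction
            edges.add((cur, letter, nxt) if letter > 0 else (nxt, -letter, cur))
            cur = nxt

    parent = {}                              # merged vertex -> surviving vertex

    def find(v):
        while v in parent:
            v = parent[v]
        return v

    changed = True
    while changed:
        changed = False
        edges = {(find(u), a, find(v)) for (u, a, v) in edges}
        out_nbr, in_nbr = {}, {}
        for (u, a, v) in edges:
            clash = None
            if out_nbr.setdefault((u, a), v) != v:
                clash = (out_nbr[(u, a)], v)     # two x_a-edges leave u
            elif in_nbr.setdefault((v, a), u) != u:
                clash = (in_nbr[(v, a)], u)      # two x_a-edges enter v
            if clash:
                parent[max(clash)] = min(clash)  # identify the two vertices
                changed = True
                break
    return edges


def rank(edges):
    """rk(H) = |E| - |V| + 1 for a finite core graph, as in Claim 3.2(2)."""
    vertices = {0} | {u for (u, _, _) in edges} | {v for (_, _, v) in edges}
    return len(edges) - len(vertices) + 1


# H = <x1 x2 x1^-3, x1^2 x2 x1^-2>, the subgroup of Figure 3.1
H = [[1, 2, -1, -1, -1], [1, 1, 2, -1, -1]]
graph = core_graph(H)
print(len(graph), rank(graph))
```

For the subgroup of Figure 3.1 the folded graph has four vertices (the cosets $H$, $Hx_1$, $Hx_1^2$, $Hx_1^3$) and five edges, so the formula gives rank 2.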

A morphism between two core graphs is a map that sends vertices to vertices and edges to edges,
and preserves the structure of the core graphs. Namely, it preserves the incidence relations, sends
the basepoint to the basepoint, and preserves the directions and labels of the edges. As in Claim
3.2, each of the following properties is either proven in (some of) [Sta83, KM02, MVW07, Pud11]
or an easy observation:

Claim 3.3. Let $H, J, L \le \mathbf{F}_k$ be subgroups. Then

(1) A morphism $\Gamma_X(H) \to \Gamma_X(J)$ exists if and only if $H \le J$.

(2) If a morphism $\Gamma_X(H) \to \Gamma_X(J)$ exists, it is unique. We denote it by $\eta^X_{H \to J}$.

(3) Whenever $H \le L \le J$, $\eta^X_{H \to J} = \eta^X_{L \to J} \circ \eta^X_{H \to L}$.†

(4) If $\Gamma_X(H)$ is a subgraph of $\Gamma_X(J)$, namely if $\eta^X_{H \to J}$ is injective, then $H \le_{\mathrm{ff}} J$.*

(5) Every morphism is an immersion (locally injective at the vertices).

4 Counting Words and Critical Subgroups 

In this section we bound the exponential growth rate (as $t \to \infty$) of

\[ \sum_{w \in \mathcal{CP}_t^m(\Omega)} |\mathrm{Crit}(w)| . \]

†Points (1)-(3) can be formulated by saying that (3.2) is in fact an isomorphism of categories, given by the
functors $\pi_1^X$ and $\Gamma_X$.

*But not vice versa: for example, consider $\langle x_1 x_2^2 \rangle \le_{\mathrm{ff}} \mathbf{F}_2$.

For the special case of the bouquet with $k = d/2$ loops, where $\mathcal{CP}_t(B_{d/2}) = (X \cup X^{-1})^t$, we find the
accurate exponential growth rate. The bound for a general graph $\Omega$ is given in terms of the spectral
radius $\rho = \rho_A(\Omega)$ of the universal covering tree of $\Omega$.

We begin with a key lemma to be used in the proofs of all cases (a bouquet, a $d$-regular base
graph and an arbitrary base graph):

Lemma 4.1. Let $w \in \mathbf{F}_k$ and let $N \le \mathbf{F}_k$ be a proper algebraic extension of $\langle w \rangle$. Then the
closed path in $\Gamma_X(N)$ corresponding to $w$ traces every edge at least twice.

Proof. First, we claim that every edge is traced at least once (in fact, even more generally, if
$H \le_{\mathrm{alg}} N$ then $\eta^X_{H \to N}$ is onto: see Definition 5.1 and e.g. [PP12, Claim 4.2]. We repeat the simple
argument here). Otherwise, let $J$ be the subgroup of $N$ corresponding to the subgraph $\Delta$ traced
by $w$ (so $\Delta = \mathrm{im}\, \eta^X_{\langle w \rangle \to N}$ and $J = \pi_1^X(\Delta)$; see Section 3.2 and in particular Claim 3.3). Then
$w \in J \lneq_{\mathrm{ff}} N$ (Claim 3.3(4)), contradicting the fact that $N$ is an algebraic extension of $\langle w \rangle$.

Next, we distinguish between separating edges and non-separating edges in $\Gamma = \Gamma_X(N)$. If $e$
is a separating edge, namely if removing $e$ separates $\Gamma$ into two connected components, then it is
obvious that the path of $w$ in $\Gamma$ must traverse $e$ an even number of times, and since this number is
$\ge 1$, it is in fact $\ge 2$.

Finally, assume that $e$ is not separating, and $w$ traverses it exactly once, so that the path
corresponding to $w$ in $\Gamma_X(N)$ is $w_1 e w_2$ (with $w_1, w_2$ avoiding $e$; we think of $e$ as oriented according
to the direction of $w$). Choose a spanning tree $T$ of $\Gamma_X(N)$ which avoids $e$ to obtain a basis for $N$
as follows. There are $r = \mathrm{rk}(N)$ excessive edges $e = e_1, e_2, \ldots, e_r$ outside the tree, and they should
be oriented arbitrarily. For each $1 \le i \le r$ let $u_i$ be the word corresponding to the path that goes
from $\otimes$ to the origin of $e_i$ via $T$, then traverses $e_i$ and returns to $\otimes$ via $T$. It is easy to see that
$\{u_1, \ldots, u_r\}$ is a basis of $N$. We claim that so is $\{w, u_2, \ldots, u_r\}$, so that $w$ is primitive in $N$ and
therefore $\langle w \rangle \le_{\mathrm{ff}} N$, a contradiction.

It is enough to show that $u_1 \in \langle w, u_2, \ldots, u_r \rangle$. Let $p_1$ be the path through $T$ from $\otimes$ to the
origin of $e$, and $p_2$ the path from the terminus of $e$ back to $\otimes$. Then

\[ u_1 = p_1 e p_2 = p_1 w_1^{-1} w_1 e w_2 w_2^{-1} p_2 = \left(p_1 w_1^{-1}\right) w \left(w_2^{-1} p_2\right) \]

and we are done because $p_1 w_1^{-1}$ and $w_2^{-1} p_2$ avoid $e$ and are thus generated by $\{u_2, \ldots, u_r\}$. □

We will also need the following simple properties of the core graph of a subgroup of rank $m$. A
'topological edge' of a graph is an edge of the graph obtained after ignoring all vertices of degree 2,
except for (possibly) the basepoint $\otimes$.

Claim 4.2. Let $\Gamma = \Gamma_X(J)$ be the core graph of a subgroup $J \le \mathbf{F}_k$ of rank $m$. Then,

(1) After omitting the string to $\otimes$ if the basepoint is a leaf, all vertices of $\Gamma$ are of degree at most
$2m$.

(2) $\Gamma$ has at most $3m - 1$ topological edges.

Proof. (1) After ignoring $\otimes$ and the string leading to $\otimes$ in case it is a leaf, all vertices of $\Gamma$ are of
degree $\ge 2$. Thus all summands in the l.h.s. of

\[ \sum_{v \in V(\Gamma)} \left[\deg(v) - 2\right] = 2|E(\Gamma)| - 2|V(\Gamma)| = 2m - 2 \]

are non-negative. So the degree of every vertex is bounded by $2 + (2m - 2) = 2m$. In fact, there is
a vertex of degree $2m$ if and only if $\Gamma$ is topologically a bouquet of $m$ loops (plus, possibly, a string
to $\otimes$).

(2) Consider $\Gamma$ as a 'topological graph' as explained above. Let $e$ and $v$ denote the number of
topological edges and vertices. It is still true that $e - v + 1 = m$, but now there are no vertices of
degree $\le 2$ except for, possibly, the basepoint. Therefore, the sum of degrees, which equals $2e$, is
at least $3(v - 1) + 1$. So

\[ 2e \ge 3(v - 1) + 1 = 3(e - m) + 1, \]

so $e \le 3m - 1$. □



4.1 The Special Case of the Bouquet 

For the special case where $\Omega = B_{d/2}$ is the bouquet of $k = d/2$ loops, our goal is to bound the
exponential growth rate of

\[ \sum_{w \in \mathcal{CP}_t^m(B_{d/2})} |\mathrm{Crit}(w)| \;=\; \sum_{w \in (X \cup X^{-1})^t :\ \pi(w) = m} |\mathrm{Crit}(w)| . \]

In order to estimate this number we first estimate the exponential growth rate of the parallel
quantity for reduced words:

Proposition 4.3. Let $k \ge 2$ and $m \in \{1, 2, \ldots, k\}$. Then

\[ \limsup_{t \to \infty} \left[ \sum_{\substack{w \in \mathbf{F}_k :\\ |w| = t \ \&\ \pi(w) = m}} |\mathrm{Crit}(w)| \right]^{1/t} \le \begin{cases} \sqrt{2k-1} & 2m - 1 \le \sqrt{2k-1} \\ 2m - 1 & 2m - 1 \ge \sqrt{2k-1} \end{cases} \]

Put differently, the lim sup is bounded by $\max\{\sqrt{2k-1},\ 2m-1\}$ (we present it in a lengthier
way to stress the threshold phenomenon). It seems that this is not only an upper bound but the
actual exponential growth rate; see Section 8.



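Proof. Note that

\[
\begin{aligned}
\sum_{\substack{w \in \mathbf{F}_k :\\ |w| = t \ \&\ \pi(w) = m}} |\mathrm{Crit}(w)|
&= \sum_{J \le \mathbf{F}_k :\ \mathrm{rk}(J) = m} \left|\left\{ w \in \mathbf{F}_k \;\middle|\; |w| = t,\ J \in \mathrm{Crit}(w) \right\}\right| \\
&\le \sum_{J \le \mathbf{F}_k :\ \mathrm{rk}(J) = m} \left|\left\{ w \in \mathbf{F}_k \;\middle|\; |w| = t,\ \langle w \rangle \lneq_{\mathrm{alg}} J \right\}\right| \\
&\le \sum_{J \le \mathbf{F}_k :\ \mathrm{rk}(J) = m} \left|\left\{ w \in J \;\middle|\; |w| = t,\ w \text{ traces each edge of } \Gamma_X(J) \text{ at least twice} \right\}\right|
\end{aligned}
\tag{4.1}
\]

where the first inequality stems from Claim 3.1 and the second from Lemma 4.1. We continue to
bound the latter sum. For each $J \le \mathbf{F}_k$ let $\nu_t(J)$ denote the corresponding summand:

\[ \nu_t(J) = \left|\left\{ w \in J \;\middle|\; |w| = t,\ w \text{ traces each edge of } \Gamma_X(J) \text{ at least twice} \right\}\right| . \]

We classify all $J$'s of rank $m$ by the number of edges in $\Gamma_X(J)$. Consider all $X$-labeled core
graphs $\Gamma$ of total size $\delta t$ and rank $m$ (so that $\delta t$ is an integer, of course). Since we are counting
words of length $t$ tracing every edge at least twice, $\nu_t(J) = 0$ if $\delta > \tfrac{1}{2}$. So we restrict to the case
$\delta \in [0, \tfrac{1}{2}]$. The counting is performed in several steps: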

• First, let us bound the number of unlabeled and unoriented connected pointed graphs with $\delta t$
edges and rank $m$ (here the rank of a connected graph is $e - v + 1$). As in the proof of Lemma
4.1, each such graph has some spanning tree and $m$ excessive edges. The paths through the
tree from $\otimes$ to the origins and termini of these edges cover the entire tree. Denote these
paths by $p_{1,1}, p_{1,2}, p_{2,1}, p_{2,2}, \ldots, p_{m,1}, p_{m,2}$. We 'unveil' the spanning tree step by step: first
we unveil $p_{1,1}$. The only unknown is its length $\in \{0, 1, \ldots, \delta t - 1\}$. Then $p_{1,2}$ leaves $p_{1,1}$ at
one of $\le \delta t$ possible vertices and goes on for some length $< \delta t$. Now, $p_{2,1}$ leaves $p_{1,1} \cup p_{1,2}$ at
one of $\le \delta t$ possible vertices and goes on for $< \delta t$ new edges. This goes on $2m$ times in total
(afterward, the ends of $p_{i,1}$ and $p_{i,2}$ are connected by an edge). In total, there are at most
$\left((\delta t)^2\right)^{2m} = (\delta t)^{4m}$ possible unlabeled pointed graphs of rank $m$ and with $\delta t$ edges†.

• Next, we bound the number of labelings of each such graph $\Gamma$ (here, the labeling includes
also the orientation of each edge). Label some edge (there are $2k$ options) and then gradually
label edges adjacent to at least one edge which is already labeled (at most $2k - 1$ possible
labels for each edge). Overall, the number of possible labelings of $\Gamma$ is $\le 2k \cdot (2k-1)^{\delta t - 1}$.

• For a given labeled core graph $\Gamma$, let $J = \pi_1^X(\Gamma)$ be the corresponding subgroup. We claim
that $\nu_t(J) \le (4t^2)^{3m-1} \cdot (2m-1)^{(1-2\delta)t}$. Indeed, note first that if the basepoint $\otimes$ is a leaf,
then every reduced $w$ must first follow the string from $\otimes$ to the first "topological" vertex
(vertex of degree $\ge 3$), and then return to the string only in its final steps back to $\otimes$. So we
can assume $w$ traces a leaf-free graph of rank $m$ and at most $\delta t$ edges. A reduced word $w \in J$
which traces every edge at least twice also traverses any topological edge at least twice, each
time in one shot (without backtracking). Each time $w$ traces some topological edge $e$ in $\Gamma$, it
begins in one of $\le t$ possible positions (in $w$), and from $\le 2$ possible directions of $e$. So there are
$\le 4t^2$ possible ways in which $w$ traces $e$ for the first two times. By Claim 4.2(2) there are at
most $3m - 1$ topological edges, and so at most $(4t^2)^{3m-1}$ possibilities for how $w$ traces each
topological edge of $\Gamma$ for the first two times. The rest of $w$ is of length (at most) $(1 - 2\delta)t$,
and in every step there are at most $2m - 1$ ways to proceed, by Claim 4.2(1).



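Hence,

\[
\begin{aligned}
\sum_{\substack{J \le \mathbf{F}_k :\ \mathrm{rk}(J) = m \\ |\Gamma_X(J)| = \delta t}} \nu_t(J)
&\le (\delta t)^{4m} \cdot 2k\,(2k-1)^{\delta t - 1} \cdot (4t^2)^{3m-1} \cdot (2m-1)^{(1-2\delta)t} \\
&\le c \cdot t^{10m-2} \cdot \left[ (2k-1)^{\delta} \cdot (2m-1)^{1-2\delta} \right]^t .
\end{aligned}
\tag{4.2}
\]

Recall that $\delta \in [0, \tfrac{1}{2}]$. We bound $\sum_{J \le \mathbf{F}_k :\ \mathrm{rk}(J) = m} \nu_t(J)$ by $t$ times the maximal possible value of
the r.h.s. of (4.2) (when going over all possible values of $\delta$). When $2m - 1 \le \sqrt{2k-1}$, the r.h.s. of
(4.2) is largest when $\delta = \tfrac{1}{2}$, so we get overall

\[ \sum_{J \le \mathbf{F}_k :\ \mathrm{rk}(J) = m} \nu_t(J) \le c \cdot t^{10m-1} \cdot \left(\sqrt{2k-1}\right)^t . \tag{4.3} \]

For $2m - 1 \ge \sqrt{2k-1}$, the r.h.s. of (4.2) is largest when $\delta = 0$, so we get overall

\[ \sum_{J \le \mathbf{F}_k :\ \mathrm{rk}(J) = m} \nu_t(J) \le c \cdot t^{10m-1} \cdot \left[2m-1\right]^t . \]

The proposition follows. □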
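The threshold in Proposition 4.3 comes from maximizing the exponential base $(2k-1)^{\delta}(2m-1)^{1-2\delta}$ of (4.2) over $\delta \in [0, \tfrac12]$. The following Python sketch checks numerically, on a grid of values of $\delta$, that this maximum equals $\max\{\sqrt{2k-1},\, 2m-1\}$. The function names and the grid resolution are illustrative assumptions, not part of the proof.

```python
import math


def growth_base(k, m, delta):
    """Exponential base in (4.2): (2k-1)^delta * (2m-1)^(1-2*delta)."""
    return (2 * k - 1) ** delta * (2 * m - 1) ** (1 - 2 * delta)


def max_growth_base(k, m, steps=1000):
    """Maximum of growth_base over an evenly spaced grid of delta in [0, 1/2]."""
    return max(growth_base(k, m, i / (2 * steps)) for i in range(steps + 1))


for k in range(2, 8):
    for m in range(1, k + 1):
        expected = max(math.sqrt(2 * k - 1), 2 * m - 1)
        assert math.isclose(max_growth_base(k, m), expected)
```

Since the logarithm of the base is linear in $\delta$, the maximum is always attained at one of the endpoints $\delta = 0$ or $\delta = \tfrac12$, which is why the grid check is exact up to floating-point rounding.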



†A tighter bound of $(\delta t)^{3m}$ can also be obtained quite easily. We do not bother to introduce it because this
expression is anyway negligible when the exponential growth rate is considered.

The next step is to move from reduced words to non-reduced words. To this goal, we use an
extended version of the well-known cogrowth formula due to Grigorchuk [Gri77] and Northshield
[Nor92]. Let $\Gamma$ be a connected $d$-regular graph. Let $b_{\Gamma,v}(t)$ denote the number of cycles of length $t$
at some vertex $v$ in $\Gamma$, and let $n_{\Gamma,v}(t)$ denote the size of the smaller set of non-backtracking cycles
of length $t$ at $v$. The spectral radius of $A_\Gamma$, denoted $\mathrm{rad}(\Gamma)$†, is equal to $\limsup_{t \to \infty} b_{\Gamma,v}(t)^{1/t}$
(in particular, this limit does not depend on $v$). The cogrowth of $\Gamma$ is defined as $\mathrm{cogr}(\Gamma) =
\limsup_{t \to \infty} n_{\Gamma,v}(t)^{1/t}$, and is also independent of $v$. The cogrowth formula expresses $\mathrm{rad}(\Gamma)$ in
terms of $\mathrm{cogr}(\Gamma)$.

Another way to view the parameters $\mathrm{rad}(\Gamma)$ and $\mathrm{cogr}(\Gamma)$ is the following: let $T_d$ be the $d$-regular
tree with basepoint $\otimes$, let $p : T_d \to \Gamma$ be a covering map such that $p(\otimes) = v$, and let
$S = p^{-1}(v) \subseteq V(T_d)$ be the fiber above $v$. Then $b_{\Gamma,v}(t)$ is the number of paths of length $t$ in $T_d$
emanating from $\otimes$ and terminating inside $S$. Similarly, $n_{\Gamma,v}(t)$ is the number of non-backtracking
paths of length $t$ in $T_d$ emanating from $\otimes$ and terminating in $S$. This is also equal to the number
of vertices in the $t$-th sphere‡ of $T_d$ belonging to $S$.

For our needs we need to introduce an extended formula applying to other types of subsets $S$
of $V(T_d)$, which do not necessarily correspond to a fiber above a graph. Even more generally, we
extend the formula to a class of functions on $V(T_d)$ (this extends the previous case if $S$ is identified
with its characteristic function $\mathbb{1}_S$).
For $f : V(T_d) \to \mathbb{R}$, denote by $\beta_f(t)$ the sum

\[ \beta_f(t) = \sum_{\substack{p:\ \text{a path from } \otimes \\ \text{of length } t}} f\left(\mathrm{end}(p)\right) \]

over all (possibly backtracking) paths of length $t$ in $T_d$ emanating from $\otimes$. Similarly, denote by
$\nu_f(t)$ the same sum over the smaller set of non-backtracking paths of length $t$ emanating from $\otimes$.

Let $\{t_i\}_{i=1}^{\infty}, \{r_j\}_{j=1}^{\infty}$ be subsequences of $\mathbb{N}$ such that if $\{y_n\}$ is any simple biased random walk§
then the probability that $y_{r_j}$ is equal to some $t_i$ is bounded away from zero (for large enough $j$)
(conceptually, $\{t_i\}$ should be "dense" enough). Two obvious examples are $t_i = r_i = i$ and also
$t_i = r_i = 2i$. The sequences $t_i = ci$, $r_i = i$ also satisfy this condition provided $c \in \mathbb{N}$ is odd.

Theorem 4.4. [Extended Cogrowth Formula [Gri77, Nor92, Pud12]] Let $d \ge 3$, $f : V(T_d) \to \mathbb{R}$,
$\beta_f(t)$, $\nu_f(t)$, $\{t_i\}$ and $\{r_j\}$ be as above. For $\alpha \in [0, d-1]$ define

\[ g(\alpha) = \begin{cases} 2\sqrt{d-1} & \alpha \le \sqrt{d-1} \\[2pt] \dfrac{d-1}{\alpha} + \alpha & \alpha \ge \sqrt{d-1} \end{cases} \tag{4.4} \]

Then,

(1) If $\nu_f(t) \le c \cdot \alpha^t$ then $\limsup_{t \to \infty} \beta_f(t)^{1/t} \le g(\alpha)$.

(2) If $\nu_f(t_i) \ge c \cdot \alpha^{t_i}$ then $\liminf_{j \to \infty} \beta_f(r_j)^{1/r_j} \ge g(\alpha)$.

(3) If $\lim_{i \to \infty} \nu_f(t_i)^{1/t_i} = \alpha$, then $\lim_{j \to \infty} \beta_f(r_j)^{1/r_j} = g(\alpha)$.

The original cogrowth formula, which applies to $d$-regular graphs $\Gamma$, determines that $\mathrm{rad}(\Gamma) =
g(\mathrm{cogr}(\Gamma))$. The proof of Theorem 4.4 appears in [Pud12] (this includes a new proof of the original
cogrowth formula). With this theorem at hand, one can obtain the sought-after bound on the
number of non-reduced words from the one on reduced words.
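As a quick sanity check of how the function $g$ of Theorem 4.4 is used, the following Python sketch evaluates it and combines it with the bound $\max\{\sqrt{2k-1},\, 2m-1\}$ of Proposition 4.3, taking $d = 2k$. The helper name `nonreduced_bound` is an illustrative choice and is not taken from the paper.

```python
import math


def g(alpha, d):
    """The function g of Theorem 4.4 for the d-regular tree, alpha in [0, d-1]."""
    threshold = math.sqrt(d - 1)
    if alpha <= threshold:
        return 2 * threshold
    return (d - 1) / alpha + alpha


def nonreduced_bound(k, m):
    """g applied, with d = 2k, to the reduced-word bound of Proposition 4.3."""
    alpha = max(math.sqrt(2 * k - 1), 2 * m - 1)
    return g(alpha, 2 * k)


# A finite d-regular graph has cogrowth d - 1, and g(d - 1) = d = rad.
print(g(3, 4))
# The d-regular tree has cogrowth 0, and g(0) = 2 * sqrt(d - 1) = rad.
print(g(0, 4))
```

The two printed values agree with the spectral radii of a finite 4-regular graph and of the 4-regular tree, and $g$ is continuous at the threshold $\alpha = \sqrt{d-1}$, where both branches equal $2\sqrt{d-1}$.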



†If $\Gamma$ is finite, $\mathrm{rad}(\Gamma) = d$. If $\Gamma$ is the $d$-regular tree, $\mathrm{rad}(\Gamma) = 2\sqrt{d-1}$.

‡The $t$-th sphere of the pointed tree $T_d$ is the set of vertices at distance $t$ from $\otimes$.

§I.e. $y_0 = 0$, $y_{n+1} = y_n + 1$ with probability $p$ for some $p \in (0, 1)$ and $y_{n+1} = y_n - 1$ otherwise.

Corollary 4.5. For every $k \ge 2$ and $m \in \{1, \ldots, k\}$,

\[ \limsup_{t \to \infty} \left[ \sum_{w \in \mathcal{CP}_t^m(B_k)} |\mathrm{Crit}(w)| \right]^{1/t} \le \begin{cases} 2\sqrt{2k-1} & 2m - 1 \le \sqrt{2k-1} \\[2pt] \dfrac{2k-1}{2m-1} + 2m - 1 & 2m - 1 \ge \sqrt{2k-1} \end{cases} \]

Proof. Consider the Cayley graph of $\mathbf{F}_k$ with respect to $X$, which is a $2k$-regular tree. Every vertex corresponds
to a word in $\mathbf{F}_k$, and we let

\[ f_m(w) = \begin{cases} |\mathrm{Crit}(w)| & \pi(w) = m \\ 0 & \text{otherwise.} \end{cases} \]

The corollary then follows by applying Theorem 4.4(1) on $f_m$, using Proposition 4.3. □

In Section 8 it is shown (Theorem 8.5) that the bound in Corollary 4.5 represents the accurate
exponential growth rate of the sum, and even merely of the number of not-necessarily-reduced words
with primitivity rank $m$.

Remark 4.6. Interestingly, the threshold of $\sqrt{2k-1}$ shows up twice, apparently independently:
both in Proposition 4.3 and in the (extended) cogrowth formula.

Finally, for $m = 0$ there is exactly one relevant reduced word: $w = 1$, and this word has exactly
one critical subgroup: the trivial subgroup. Thus, it suffices to bound the number of words in
$(X \cup X^{-1})^t$ reducing to 1. This is a well-known result:

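Claim 4.7.

\[ \limsup_{t \to \infty} \left|\mathcal{CP}_t^0\left(B_{d/2}\right)\right|^{1/t} = \limsup_{t \to \infty} \left|\left\{ w \in (X \cup X^{-1})^t \;\middle|\; w \text{ reduces to } 1 \right\}\right|^{1/t} = 2\sqrt{2k-1}. \]

Proof. Denote by $c_\Gamma(t, u, v)$ the number of paths of length $t$ from the vertex $u$ to the vertex $v$
in a connected graph $\Gamma$. If, as above, $A_\Gamma$ denotes the adjacency operator on $\ell^2(V(\Gamma))$, then
$c_\Gamma(t, u, v) = \langle A_\Gamma^t \delta_u, \delta_v \rangle$ ($\langle \cdot, \cdot \rangle$ stands for the standard inner product). As mentioned above, if $\Gamma$
has bounded degrees, then $A_\Gamma$ is a bounded self-adjoint operator, hence

\[ \mathrm{rad}(\Gamma) = \|A_\Gamma\| = \limsup_{t \to \infty} \sqrt[t]{c_\Gamma(t, u, v)} \tag{4.5} \]

for every $u, v \in V(\Gamma)$. Moreover,

\[ c_\Gamma(t, u, v) = \langle A_\Gamma^t \delta_u, \delta_v \rangle \le \|A_\Gamma^t \delta_u\| \cdot \|\delta_v\| \le \|A_\Gamma\|^t \cdot \|\delta_u\| \cdot \|\delta_v\| = \mathrm{rad}(\Gamma)^t \tag{4.6} \]

(For these facts and other related ones we refer the reader to [Lyo12, §6]).

The words of length $t$ reducing to 1 are exactly the closed paths of length $t$ at the basepoint of the
$2k$-regular tree $T_{2k}$. So the number we seek is
$\limsup_{t \to \infty} \sqrt[t]{c_{T_{2k}}(t, v, v)}$, which therefore equals $\mathrm{rad}(T_{2k}) = 2\sqrt{2k-1}$. □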
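The count in Claim 4.7 is easy to explore numerically. The Python sketch below counts closed walks of length $t$ at the root of the $d$-regular tree (for $d = 2k$, these are the words of length $t$ in $X \cup X^{-1}$ reducing to 1) by tracking only the distance from the root. The function name and the dynamic-programming formulation are illustrative assumptions, not taken from the paper.

```python
import math


def closed_walks_on_tree(d, t):
    """Number of closed walks of length t at the root of the d-regular tree."""
    # counts[j] = number of walks of the current length ending at distance j
    counts = [1] + [0] * t
    for _ in range(t):
        new = [0] * (t + 1)
        for j, c in enumerate(counts):
            if c == 0:
                continue
            if j == 0:
                new[1] += d * c              # d ways to leave the root
            else:
                new[j - 1] += c              # one way back towards the root
                if j + 1 <= t:
                    new[j + 1] += (d - 1) * c  # d - 1 ways further away
        counts = new
    return counts[0]


k = 2
for t in (2, 4, 50, 200):
    n = closed_walks_on_tree(2 * k, t)
    # by (4.6) the t-th root never exceeds rad(T_2k) = 2 * sqrt(2k - 1)
    print(t, math.exp(math.log(n) / t), 2 * math.sqrt(2 * k - 1))
```

For $k = 2$ there are 4 such words of length 2 and 28 of length 4, and the $t$-th roots stay below $2\sqrt{3}$, in line with (4.6); the approach to the limit is slow because of polynomial correction factors.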



4.2 A General Base-Graph 

We now return to the general case of a general base graph $\Omega$. Theorem 4.8 below is needed for
proving the bound on the spectrum of the adjacency operator on $\Gamma$, the random covering of $\Omega$ in
the $\mathcal{C}_{n,\Omega}$ model (the first part of Theorem 1.4). The small variation needed for the second part of
this theorem, dealing with the Markov operator, is discussed in Section 7.1.

Recall that $T$ denotes the universal covering of $\Omega$ (and of $\Gamma$), and $\rho = \rho_A(\Omega)$ denotes the spectral
radius of its adjacency operator. We also let $\mathrm{rk}(\Omega)$ denote the rank of the fundamental group of $\Omega$,
so $\mathrm{rk}(\Omega) = |E(\Omega)| - |V(\Omega)| + 1$. The main theorem of this subsection is the following:

Theorem 4.8. Let $\Omega$ be a finite, connected graph, and let $m \in \{1, \ldots, \mathrm{rk}(\Omega)\}$. Then

\[ \limsup_{t \to \infty} \left[ \sum_{w \in \mathcal{CP}_t^m(\Omega)} |\mathrm{Crit}(w)| \right]^{1/t} \le (2m - 1) \cdot \rho . \]

Before proceeding to the proof of this theorem, let us refer to the cases which are left out: when
$\pi(w) > \mathrm{rk}(\Omega)$ and when $\pi(w) = 0$. First, it turns out there are no words in $\mathcal{CP}_t(\Omega)$ admitting
finite primitivity rank which is greater than $\mathrm{rk}(\Omega)$:

Lemma 4.9. Let $\Omega$ be a finite, connected graph. Then $\pi(w) \in \{0, 1, \ldots, \mathrm{rk}(\Omega), \infty\}$ for every
$w \in \mathcal{CP}_t(\Omega)$.

Proof. Recall from Section 2 that we denote $k = |E(\Omega)|$ and orient each of the $k$ edges arbitrarily
and label them by $x_1, \ldots, x_k$. With the orientation and labeling of its edges, $\Omega$ becomes a non-pointed
$X$-labeled graph, where $X = \{x_1, \ldots, x_k\}$. (This is not a core graph, for it has no basepoint
and may have leaves.) So every path in $\Omega$ of length $t$ can be regarded as an element of $(X \cup X^{-1})^t$
and (after reduction) of $\mathbf{F}_k = F(X)$. If a word $w \in \mathcal{CP}_t(\Omega)$ begins (and ends) at $v \in V(\Omega)$, then
$w \in J_v$, where $J_v = \pi_1^X(\Omega_v)$ is the subgroup of $\mathbf{F}_k$ corresponding to $\Omega_v$, the $X$-labeled graph $\Omega$ pointed
at $v$. The rank of $J_v$ is independent of $v$ and equals $\mathrm{rk}(\Omega)$. It is easy to see that $J_v \le_{\mathrm{ff}} \mathbf{F}_k$ (recall
that '$\le_{\mathrm{ff}}$' denotes a free factor): obtain a basis for $J_v$ by choosing an arbitrary spanning tree and
orienting the edges outside the tree, as in the proof of Lemma 4.1. This basis can then be extended
to a basis of $\mathbf{F}_k$ by the $x_i$'s associated with the edges inside the spanning tree. So if $w$ is primitive
in $J_v$, it is also primitive in $\mathbf{F}_k$ and $\pi(w) = \infty$. Otherwise, $\pi(w) \le \mathrm{rk}(J_v) = \mathrm{rk}(\Omega)$. □

As for $m = 0$, i.e. words reducing to 1, recall that the trivial element of $\mathbf{F}_k$ has exactly one
critical subgroup, so $\sum_{w \in \mathcal{CP}_t^0(\Omega)} |\mathrm{Crit}(w)|$ equals $|\mathcal{CP}_t^0(\Omega)|$.

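Claim 4.10.

\[ \limsup_{t \to \infty} \left|\mathcal{CP}_t^0(\Omega)\right|^{1/t} = \rho . \]

Proof. For a given vertex $v \in V(\Omega)$, each cycle at $v$ of length $t$ reducing to 1 lifts to a cycle in $T$
at $\hat{v}$, where $\hat{v} \in p^{-1}(v)$ is some vertex at the fiber above $v$ of the covering map $p : T \to \Omega$. This
number is thus $\left[A_T^t \delta_{\hat{v}}\right]_{\hat{v}}$, and

\[ \left[A_T^t \delta_{\hat{v}}\right]_{\hat{v}} = \langle A_T^t \delta_{\hat{v}}, \delta_{\hat{v}} \rangle \le \|A_T^t\| \cdot \|\delta_{\hat{v}}\|^2 = \|A_T\|^t = \rho^t \]

(the last equality follows from $A_T$ being a self-adjoint operator), and thus

\[ \limsup_{t \to \infty} \left|\mathcal{CP}_t^0(\Omega)\right|^{1/t} \le \limsup_{t \to \infty} \left[ |V(\Omega)| \cdot \rho^t \right]^{1/t} = \rho . \]

To show there is actually equality, repeat the argument from Claim 4.7. □

We return to the proof of Theorem 4.8. By Claim 3.1,

\[
\begin{aligned}
\sum_{w \in \mathcal{CP}_t^m(\Omega)} |\mathrm{Crit}(w)|
&= \sum_{\substack{N \le \mathbf{F}_k :\\ \mathrm{rk}(N) = m}} \left|\left\{ w \in \mathcal{CP}_t(\Omega) \;\middle|\; N \in \mathrm{Crit}(w) \right\}\right| \\
&\le \sum_{\substack{N \le \mathbf{F}_k :\\ \mathrm{rk}(N) = m}} \left|\left\{ w \in \mathcal{CP}_t(\Omega) \;\middle|\; \langle w \rangle \lneq_{\mathrm{alg}} N \right\}\right|
\end{aligned}
\tag{4.7}
\]

and we actually bound the latter summation. For every $N \le \mathbf{F}_k$, we let $\beta_t(N)$ denote the
corresponding summand, namely

\[ \beta_t(N) = \left|\left\{ w \in \mathcal{CP}_t(\Omega) \;\middle|\; \langle w \rangle \lneq_{\mathrm{alg}} N \right\}\right| . \]

Note that while a non-reduced element $w \in \mathcal{CP}_t(\Omega)$ with $w \in N$ might not correspond to a closed
path in $\Gamma_X(N)$, it always does correspond to a closed path at the basepoint of the Schreier coset
graph $\overline{\Gamma}_X(N)$.

If $N \le \mathbf{F}_k$ satisfies that the basepoint of $\Gamma_X(N)$ is not a leaf, call $N$ and its core graph CR
(cyclically reduced).

Figure 4.1: The three CR representatives of topological graphs of rank 2: Figure-Eight, Barbell
and Theta.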


Claim 4.11. If $N \le \mathbf{F}_k$ is CR then

\[ \sum_{N' \text{ is conjugate to } N} \beta_t(N') \le t \cdot \beta_t(N) . \]

Proof. The Schreier graphs of $N$ and of any conjugate of it differ only by the basepoint. If $N'$ is
some conjugate of $N$ and $w' \in \mathcal{CP}_t(\Omega)$ satisfies $\langle w' \rangle \lneq_{\mathrm{alg}} N'$, then the path corresponding to $w'$ in
the Schreier graph $\overline{\Gamma}_X(N')$ must visit the basepoint of $\overline{\Gamma}_X(N)$ (by Lemma 4.1). So there is some
cyclic rotation $w$ of $w'$ satisfying $\langle w \rangle \lneq_{\mathrm{alg}} N$ (clearly, $w$ also belongs to $\mathcal{CP}_t(\Omega)$). On the other
hand, each such $w$ has at most $t$ possible cyclic rotations, each of which corresponds to one $w'$ and
one $N'$. □

Recall from the proof of Lemma 4.9 that $J_v \le \mathbf{F}_k$ denotes the subgroup corresponding to $\Omega_v$,
the $X$-labeled graph pointed at $v$, for some $v \in V(\Omega)$.

Claim 4.12. $\beta_t(N) = 0$ unless $N \le J_v$ for some vertex $v \in V(\Omega)$.

Proof. Indeed, for each $v$, as $J_v \le_{\mathrm{ff}} \mathbf{F}_k$, it follows that $J_v \cap N \le_{\mathrm{ff}} N$ (see e.g. [PP12, Claim 3.9]).
So if $w \in \mathcal{CP}_t(\Omega)$ belongs to $N$ and begins at vertex $v$, then $w$ belongs to the free factor $J_v \cap N$ of
$N$, which is proper, unless $N \le J_v$. □

Next, we classify the subgroups $N \le \mathbf{F}_k$ according to their "topological" core graph $\Lambda$. This is
the homeomorphism class of the pointed $\Gamma_X(N)$. Namely, this is a graph obtained from $\Gamma_X(N)$
by ignoring vertices of degree two, except for (possibly) the basepoint (as in Claim 4.2). As Claim
4.11 allows us to restrict to one CR representative from each conjugacy class of subgroups in $\mathbf{F}_k$,
we also restrict attention to one CR representative $\Lambda$ of each "conjugacy class" of topological core
graphs. Ignoring the basepoints, any $\Lambda'$ in the "conjugacy class" of $\Lambda$ retracts to this representative.
For example, we need exactly three such representatives in rank 2, as shown in Figure 4.1.

The following proposition is the key step in the proof of Theorem 4.8.

Proposition 4.13. Let $\Lambda$ be a pointed finite connected graph without vertices of degree 1 or 2 except
for possibly the basepoint, and let $\delta$ denote its maximal degree. Then the sum of $\beta_t(N)$ over all
subgroups $N \le \mathbf{F}_k$ whose core graph is topologically $\Lambda$ is at most

\[ |V(\Omega)| \cdot \left(4t^4\right)^{|E(\Lambda)|} \cdot (\delta - 1)^t \cdot \rho^t . \]

Proof. Denote $r = |E(\Lambda)|$. Order and orient the edges of $\Lambda$, $\{e_1, e_2, \ldots, e_r\}$, so that $e_1$ emanates
from $\otimes$, and for every $i \ge 2$, $e_i$ emanates either from $\otimes$ or from a vertex which is the beginning or
endpoint of one of $e_1, \ldots, e_{i-1}$. In addition, let $v_0$ denote $\otimes$ and $v_i$ denote the endpoint of $e_i$ for
$1 \le i \le r$. For example, one can label the barbell-shaped graph so that $e_1$ is the edge from $\otimes$ to $\bullet$,
$e_2$ is the loop at $\bullet$ and $e_3$ is the loop at $\otimes$, where
$v_0 = v_3$ are $\otimes$ and $v_1 = v_2$ are $\bullet$. Also, denote by $\mathrm{beg}(i)$ the smallest index $j$ such that $e_i$ begins at
$v_j$, so $e_i$ is a directed edge from $v_{\mathrm{beg}(i)}$ to $v_i$ and $\mathrm{beg}(i) < i$. In our example, $\mathrm{beg}(1) = \mathrm{beg}(3) = 0$
and $\mathrm{beg}(2) = 1$.

Note that each $N$ corresponding to $\Lambda$ is determined by the paths (words in $\mathbf{F}_k$) associated with
$e_1, \ldots, e_r$. From Claim 4.12 it follows one can restrict to subgroups $N$ which are subgroups of $J_v$
for some $v \in V(\Omega)$. So fix some $v_0 \in V(\Omega)$ and also some $\hat{v}_0 \in V(T)$ which projects to $v_0$. We
claim that every subgroup $N \le J_{v_0}$ corresponding to $\Lambda$ is completely determined by a set of vertices
$\hat{v}_1, \ldots, \hat{v}_r$ in $T$: the topological edge in $\Gamma_X(N)$ associated with $e_i$ corresponds to the path in $T$
from $\hat{v}_{\mathrm{beg}(i)}$ to $\hat{v}_i$. (There are some constraints on the choice of the $\hat{v}_i$'s. For example, if $v_i = v_j$ then
$\hat{v}_i$ and $\hat{v}_j$ must belong to the same fiber of the projection map $p : T \to \Omega$.) So instead of summing
over all possible $N$'s, we go through all possible choices of vertices $\hat{v}_1, \ldots, \hat{v}_r$ in $T$.

The counting argument that follows resembles the one in Proposition 4.3. Fix a particular
$N \le J_{v_0}$ corresponding to $\Lambda$ and let $\hat{v}_1, \ldots, \hat{v}_r$ be the corresponding vertices in $T$. By Lemma 4.1,
if $w \in (X \cup X^{-1})^t$ satisfies $\langle w \rangle \lneq_{\mathrm{alg}} N$, then its reduced form traverses every topological edge of
$\Gamma_X(N)$ at least twice. For each $i$, assume that $w$ first traverses the topological edge associated
with $e_i$ starting at position $\tau_{i,1}$ and through $\ell_{i,1}$ steps, and then from position $\tau_{i,2}$ through $\ell_{i,2}$
steps (recall that $w$ is not reduced so $\ell_{i,1}$ may be different from $\ell_{i,2}$). The directions of these
traverses are $\varepsilon_{i,1}, \varepsilon_{i,2} \in \{\pm 1\}$. In total, there are less than $t^{2r}$ options for the $\tau_{i,j}$'s, less than $t^{2r}$
options for the $\ell_{i,j}$'s and less than $2^{2r}$ for the $\varepsilon_{i,j}$'s: a total of less than $\left(4t^4\right)^r$ options. There are
$t - \ell_{1,1} - \ell_{1,2} - \cdots - \ell_{r,1} - \ell_{r,2}$ remaining steps, and these are divided into at most $4r$ segments (we can
always assume one of the $\tau_{i,1}$'s equals 0). Denote the lengths of these segments by $q_1, \ldots, q_{4r}$ (some
may be 0). The $i$'th segment reduces to some path in $\Gamma_X(N)$, with at most $(\delta - 1)^{q_i}$ possibilities.
Overall, there are at most $(\delta - 1)^{q_1 + \cdots + q_{4r}} \le (\delta - 1)^t$ options to choose the reduced paths traced
by these $4r$ segments in $w$. Given such a reduced path, let $\hat{x}_i, \hat{y}_i \in V(T)$ be suitable vertices in the
tree such that the reduced path equals the unique reduced path from $\hat{x}_i$ to $\hat{y}_i$.



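Now, we sum over all subgroups $N$ corresponding to $\Lambda$ and all words $w \in \mathcal{CP}_t(\Omega)$ with $\langle w \rangle \lneq_{\mathrm{alg}} N$.
By adding a factor of $|V(\Omega)| \cdot \left(4t^4\right)^r \cdot (\delta - 1)^t$ we assume we already know $v_0$ and $\hat{v}_0$, the $\tau_{i,j}$'s,
$\ell_{i,j}$'s, $\varepsilon_{i,j}$'s, the $q_i$'s and the reduced $4r$ paths. Moreover, conditioning on knowing $\hat{v}_1, \ldots, \hat{v}_r$, we also
know the $\hat{x}_i$'s and the $\hat{y}_i$'s. Recall that $c_\Gamma(t, u, v)$ denotes the number of paths of length $t$ in a graph
$\Gamma$ from the vertex $u$ to the vertex $v$, and that by (4.6), $c_T(t, u, v) \le \rho^t$ for every $u, v \in V(T)$. For
each $i = 1, \ldots, r$ and $j = 1, 2$, there are $c_T(\ell_{i,j}, \hat{v}_{\mathrm{beg}(i)}, \hat{v}_i)$ possible subwords corresponding to the
$j$'th traverse of $e_i$ (even if $\varepsilon_{i,j} = -1$, because $c_T(\ell_{i,j}, \hat{v}_{\mathrm{beg}(i)}, \hat{v}_i) = c_T(\ell_{i,j}, \hat{v}_i, \hat{v}_{\mathrm{beg}(i)})$). Similarly,
there are at most $c_T(q_i, \hat{x}_i, \hat{y}_i)$ subwords corresponding to the $i$'th intermediate segment. Thus,
if $\alpha = |V(\Omega)| \cdot \left(4t^4\right)^r \cdot (\delta - 1)^t$ then

\[
\begin{aligned}
\sum_{\substack{N \le \mathbf{F}_k :\\ \Gamma_X(N) \cong \Lambda}} \beta_t(N)
&\le \alpha \cdot \sum_{\hat{v}_1, \ldots, \hat{v}_r \in V(T)} \left[ \prod_{i=1}^{r} \prod_{j=1}^{2} c_T\left(\ell_{i,j}, \hat{v}_{\mathrm{beg}(i)}, \hat{v}_i\right) \right] \left[ \prod_{i=1}^{4r} c_T\left(q_i, \hat{x}_i, \hat{y}_i\right) \right] \\
&\le \alpha \cdot \rho^{\sum_{i=1}^{4r} q_i} \cdot \sum_{\hat{v}_1, \ldots, \hat{v}_r \in V(T)} \ \prod_{i=1}^{r} \prod_{j=1}^{2} c_T\left(\ell_{i,j}, \hat{v}_{\mathrm{beg}(i)}, \hat{v}_i\right) .
\end{aligned}
\]

Note that $\mathrm{beg}(i) < i$, so $c_T(\ell_{i,j}, \hat{v}_{\mathrm{beg}(i)}, \hat{v}_i)$ only depends on $\ell_{i,j}$ and $\hat{v}_0, \ldots, \hat{v}_i$ (and not on $\hat{v}_{i+1}, \ldots, \hat{v}_r$).
Therefore, if we write $f(i) = \prod_{j=1}^{2} c_T(\ell_{i,j}, \hat{v}_{\mathrm{beg}(i)}, \hat{v}_i)$, we can split the sum to obtain:

\[ \sum_{\substack{N \le \mathbf{F}_k :\\ \Gamma_X(N) \cong \Lambda}} \beta_t(N) \le \alpha \cdot \rho^{\sum_{i} q_i} \sum_{\hat{v}_1 \in V(T)} f(1) \sum_{\hat{v}_2 \in V(T)} f(2) \cdots \sum_{\hat{v}_r \in V(T)} f(r) . \]

The following step is the crux of the matter. We use the fact that each topological edge is
traversed twice to get rid of the summation over vertices in $T$. We begin with the last edge $e_r$,
where we replace the expression $\sum_{\hat{v}_r \in V(T)} f(r)$ as follows:

\[
\begin{aligned}
\sum_{\hat{v}_r \in V(T)} f(r)
&= \sum_{\hat{v}_r \in V(T)} c_T\left(\ell_{r,1}, \hat{v}_{\mathrm{beg}(r)}, \hat{v}_r\right) c_T\left(\ell_{r,2}, \hat{v}_{\mathrm{beg}(r)}, \hat{v}_r\right) \\
&= \sum_{\hat{v}_r \in V(T)} c_T\left(\ell_{r,1}, \hat{v}_{\mathrm{beg}(r)}, \hat{v}_r\right) c_T\left(\ell_{r,2}, \hat{v}_r, \hat{v}_{\mathrm{beg}(r)}\right) \\
&\overset{*}{=} c_T\left(\ell_{r,1} + \ell_{r,2}, \hat{v}_{\mathrm{beg}(r)}, \hat{v}_{\mathrm{beg}(r)}\right) \le \rho^{\ell_{r,1} + \ell_{r,2}} .
\end{aligned}
\]

The crucial step here was the equality $\overset{*}{=}$. It follows from the fact that $\hat{v}_r$ can be recovered
as the vertex of $T$ visited by the path of length $\ell_{r,1} + \ell_{r,2}$ after $\ell_{r,1}$ steps. After "peeling" the
expression $\sum_{\hat{v}_r \in V(T)} f(r)$, we can go on and bound $\sum_{\hat{v}_{r-1} \in V(T)} f(r-1)$ by $\rho^{\ell_{r-1,1} + \ell_{r-1,2}}$ and so
on. Eventually, we obtain

\[ \sum_{\substack{N \le \mathbf{F}_k :\\ \Gamma_X(N) \cong \Lambda}} \beta_t(N) \le \alpha \cdot \rho^{\sum_{i} q_i + \sum_{i,j} \ell_{i,j}} = |V(\Omega)| \cdot \left(4t^4\right)^r \cdot (\delta - 1)^t \cdot \rho^t . \]

□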
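The "peeling" equality in the proof above is an instance of the general fact that summing, over all intermediate vertices, the number of walks of length $\ell_1$ to that vertex times the number of walks of length $\ell_2$ back gives the number of closed walks of length $\ell_1 + \ell_2$. The following Python sketch checks this on a small finite graph; the graph, the helper names and the use of plain nested lists are illustrative assumptions (the proof itself applies the identity on the infinite tree $T$).

```python
def mat_mul(A, B):
    """Product of two square integer matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def walk_counts(A, length):
    """Matrix whose (u, v) entry is the number of walks of given length from u to v."""
    n = len(A)
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(length):
        P = mat_mul(P, A)
    return P


# Adjacency matrix of a 5-cycle with one chord (an arbitrary small example).
A = [
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
]

l1, l2, u = 3, 4, 0
c1, c2, c12 = walk_counts(A, l1), walk_counts(A, l2), walk_counts(A, l1 + l2)
lhs = sum(c1[u][v] * c2[v][u] for v in range(len(A)))
assert lhs == c12[u][u]
```

In matrix terms this is just $(A^{\ell_1} A^{\ell_2})_{uu} = (A^{\ell_1 + \ell_2})_{uu}$, which is why the sum over $\hat{v}_r$ collapses to a single closed-walk count that (4.6) then bounds by $\rho^{\ell_{r,1} + \ell_{r,2}}$.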

Finally, we are in position to establish the upper bound stated in Theorem 14.81 Fix m € 
{1,2,..., rk (CI)}. Then by (02} and Claim OB 

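$$\sum_{w \in \mathcal{CP}_t^m(\Omega)} |\mathrm{Crit}(w)| \;\le\; \sum_{N \le F_k:\ \mathrm{rk}(N)=m} \beta_t(N) \;\le\; \sum_{[N] \in \mathrm{ConjCls}(F_k, m),\ N \text{ is CR}} t \cdot \beta_t(N) \qquad (4.8)$$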

where the final summation is over all conjugacy classes of subgroups of rank $m$ in $F_k$, and for each class $N$ is a CR representative. Moreover, we choose these representatives $N$ so that if $[N_1]$ and $[N_2]$ correspond to the same non-pointed topological graph, the representatives $N_1$ and $N_2$ correspond to the same pointed topological graph $\Lambda$.

Finally, split the summation over the CR representatives $N$ by their topological graph $\Lambda$. By the latter assumption, we need only one representative of each conjugacy class of pointed topological graphs. By Claim 4.2, each such $\Lambda$ has maximal degree at most $2m$ and at most $3m-1$ edges, so by Proposition 4.13 the $N$'s corresponding to each $\Lambda$ contribute to the summation in (4.8) at most

$$t \cdot |V(\Omega)| \cdot (4t^4)^{3m-1} \cdot (2m-1)^t \cdot \rho^t.$$

This finishes the proof of Theorem 4.8, as there is a finite number of "conjugacy classes" of $\Lambda$'s of rank $m$. □

4.3 An Arbitrary Regular Base-Graph $\Omega$

We finish this section with the simple observation that when $\Omega$ is $d$-regular (but not necessarily the bouquet), the bounds from Corollary 4.5 apply. Firstly, from Lemma 4.9 it follows that $\pi(w) \in \{0, 1, 2, \ldots, \mathrm{rk}(\Omega), \infty\}$. For all finite primitivity ranks the bounds we get are equivalent to or better than those in Corollary 4.5.

Corollary 4.14. Let $\Omega$ be a finite, connected $d$-regular graph, and let $m \in \{0, 1, \ldots, \mathrm{rk}(\Omega)\}$. Then

$$\limsup_{t \to \infty} \left[\sum_{w \in \mathcal{CP}_t^m(\Omega)} |\mathrm{Crit}(w)|\right]^{1/t} \le \begin{cases} 2\sqrt{d-1} & 2m-1 \in [-1, \sqrt{d-1}] \\ \frac{d-1}{2m-1} + 2m-1 & 2m-1 \in [\sqrt{d-1}, d-1] \\ d & 2m-1 \in [d-1, 2\,\mathrm{rk}(\Omega)-1] \end{cases}$$

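Proof. First, for the words with $\pi(w) = 0$, that is, words reducing to 1, Claim 4.7 with its proof applies here too, so

$$\limsup_{t \to \infty} \left[\sum_{w \in \mathcal{CP}_t^0(\Omega)} |\mathrm{Crit}(w)|\right]^{1/t} = \limsup_{t \to \infty} \left|\mathcal{CP}_t^0(\Omega)\right|^{1/t} = 2\sqrt{d-1}.$$

For $m \ge 1$, since the (extended) cogrowth formula (Theorem 4.4) applies here too, it is enough to prove that for reduced words we have:

$$\limsup_{t \to \infty} \left[\sum_{w \in \mathcal{CP}_t^m(\Omega):\ w \text{ is reduced}} |\mathrm{Crit}(w)|\right]^{1/t} \le \begin{cases} \sqrt{d-1} & 2m-1 \in [1, \sqrt{d-1}] \\ 2m-1 & 2m-1 \in [\sqrt{d-1}, d-1] \\ d-1 & 2m-1 \in [d-1, 2\,\mathrm{rk}(\Omega)-1] \end{cases}$$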



From the proof of Claim 4.12 we deduce that every critical subgroup is necessarily a subgroup of $J_v = \pi_1^X(\Omega_v)$ for some vertex $v \in V(\Omega)$. As in the proof of Proposition 4.3, we denote

$$\nu_t(J) = \left|\left\{ w \in F_k \;\middle|\; |w| = t,\ w \text{ traces each edge of } \Gamma_X(J) \text{ at least twice} \right\}\right|$$

for every $J \le F_k$, and as in (4.1), we obtain the bound:

$$\sum_{w \in \mathcal{CP}_t^m(\Omega):\ w \text{ is reduced}} |\mathrm{Crit}(w)| \;\le\; \sum_{v \in V(\Omega)} \ \sum_{J \le J_v:\ \mathrm{rk}(J)=m} \nu_t(J).$$

We carry out the same counting argument as in the proof of Proposition 4.3:

• The first stage, where we count unlabeled and unoriented pointed graphs of a certain size and rank, remains unchanged.

• For the second stage of labeling and orienting the graph, we first choose $v$ ($|V(\Omega)|$ options), and then we use the fact that whenever $J \le J_v$, there is a core-graph morphism $\eta : \Gamma_X(J) \to \Omega_v$, which is, as always, an immersion (i.e. locally injective). So we first label an arbitrary edge incident to the basepoint $\otimes$, and this one has to be labeled like one of the $d$ edges incident with $\otimes$ at $\Omega_v$. We then gradually label edges adjacent to at least one already-labeled edge. Thus, the image of one of the endpoints of the current edge under the core-graph morphism is already known, and there are at most $d-1$ options to label the current edge. Overall, the number of possible labelings is bounded by $|V(\Omega)| \cdot d(d-1)^{3m-1}$.

• The third and last stage, where we estimate $\nu_t(J)$ for a particular $J$, is almost identical. The only difference is that every vertex in $\Gamma_X(J)$ is of degree at most $\min\{2m, d\}$, so overall we



obtain v t (J) < Ut 1 



(min{2m,d} - If 



We conclude as in the proof of Proposition 4.3.



□ 



5 Controlling the Error of $\mathbb{E}[\mathcal{F}_{w,n}]$

In this section we establish the third step of the proofs of Theorems 1.1 and 1.4, as introduced in the overview of the proof (Section 2). Recall that according to Theorem 2.3, for every $w \in F_k$ the following holds:

$$\mathbb{E}[\mathcal{F}_{w,n}] = 1 + \frac{|\mathrm{Crit}(w)|}{n^{\pi(w)-1}} + O\!\left(\frac{1}{n^{\pi(w)}}\right).$$

But the $O(\cdot)$ term depends on $w$. Our goal here is to obtain a bound on the $O(\cdot)$ term which depends solely on the length of $w$, namely a bound which is uniform over all words of a certain length. This is done in the following proposition:

Proposition 5.1. Let $w \in (X \cup X^{-1})^t$ satisfy $\pi(w) \ne 0$ (so $w$ does not reduce to 1). If $n > t^2$ then

$$\mathbb{E}[\mathcal{F}_{w,n}] \le 1 + \frac{1}{n^{\pi(w)-1}}\left(|\mathrm{Crit}(w)| + \frac{t^{2+2\pi(w)}}{n-t^2}\right).$$
Achieving such a bound requires more elaborate details from the proof of Theorem 2.3, which appears in [PP12]. We therefore begin by recalling relevant concepts and results from [PP12]. We then present the proof of Proposition 5.1 in Section 5.5.

Before that, let us mention that the same statement holds for words in $(X \cup X^{-1})^t$ that reduce to 1:

Claim 5.2. Let $w \in (X \cup X^{-1})^t$ satisfy $\pi(w) = 0$ (so $w$ reduces to 1). If $n > t^2$ then

$$\mathbb{E}[\mathcal{F}_{w,n}] \le 1 + \frac{1}{n^{\pi(w)-1}}\left(|\mathrm{Crit}(w)| + \frac{t^{2+2\pi(w)}}{n-t^2}\right).$$

Proof. Recall that $\pi(w) = 0$ if and only if $w = 1$ as an element of $F_k$. But then the only $w$-critical subgroup is the trivial one, and so $\mathbb{E}[\mathcal{F}_{w,n}] = n = 1 + \frac{1}{n^{-1}}\left(|\mathrm{Crit}(w)| - \frac{1}{n}\right)$, which is indeed less than the bound in the statement. □



5.1 The Partial Order "covers" 

In Section 3.2, morphisms of core graphs were discussed. A special role is played by surjective morphisms of core graphs:

Definition 5.3. Let $H \le J \le F_k$. Whenever the morphism $\eta^X_{H \to J} : \Gamma_X(H) \to \Gamma_X(J)$ is surjective, we say that $\Gamma_X(H)$ covers $\Gamma_X(J)$, or that $\Gamma_X(J)$ is a quotient of $\Gamma_X(H)$. As for the groups, we say that $H$ $X$-covers $J$ and denote this by $H \le_{\vec{X}} J$.

By "surjective" we mean surjective on both vertices and edges. Note that we use the term "covers" even though in general this is not a topological covering map (a morphism between core graphs is always locally injective at the vertices, but it need not be locally bijective). In contrast, the random graphs in $\mathcal{C}_{n,\Omega}$ are topological covering maps, and we reserve the term "coverings" for these.

For instance, $H = \langle x_1 x_2 x_1^{-3},\ x_1^2 x_2 x_1^{-2} \rangle \le F_k$ $X$-covers the group $J = \langle x_2,\ x_1^2,\ x_1 x_2 x_1 \rangle$, the corresponding core graphs of which are the leftmost and rightmost graphs in Figure 5.1. As another example, a core graph $\Gamma$ $X$-covers $\Gamma_X(F_k)$ (which is merely a wedge of $k$ loops) if and only if it contains edges of all $k$ labels.

As implied by the notation, the relation $H \le_{\vec{X}} J$ indeed depends on the given basis $X$ of $F_k$. For example, if $H = \langle x_1 x_2 \rangle$ then $H \le_{\vec{X}} F_2$. However, for $Y = \{x_1 x_2, x_2\}$, $H$ does not $Y$-cover $F_2$, as $\Gamma_Y(H)$ consists of a single vertex and a single loop and has no quotients apart from itself.

It is easy to see that the relation "$\le_{\vec{X}}$" indeed constitutes a partial ordering of the set of subgroups of $F_k$. In fact, restricted to f.g. subgroups it becomes a locally-finite partial order, which means that if $H \le_{\vec{X}} J$ then the interval of intermediate subgroups $[H, J]_{\vec{X}} = \{M \le F_k \mid H \le_{\vec{X}} M \le_{\vec{X}} J\}$ is finite:

Claim 5.4. If $H \le F_k$ is a f.g. subgroup then it $X$-covers only a finite number of groups. In particular, the partial order "$\le_{\vec{X}}$" restricted to f.g. subgroups of $F_k$ is locally finite.

Proof. The claim follows from the fact that $\Gamma_X(H)$ is finite (Claim 3.2(1)) and thus has only finitely many quotients. Each quotient corresponds to a single group, by (3.2). □



5.2 Partitions and Quotients 

It is easy to see that a quotient $\Gamma_X(J)$ of $\Gamma_X(H)$ is determined by the partition it induces on the vertex set $V(\Gamma_X(H))$ (the vertex-fibers of the morphism $\eta^X_{H \to J}$). However, not every partition $P$ of $V(\Gamma_X(H))$ corresponds to a quotient core-graph. Indeed, $\Delta$, the graph we obtain after merging the vertices grouped together in $P$, might not be a core-graph: two distinct $j$-edges may have the same origin or the same terminus. (For a combinatorial description of core-graphs see e.g. [Pud11, Claim 2.1].) Then again, when a partition $P$ of $V(\Gamma_X(H))$ yields a quotient which is not a core-graph, we can perform Stallings foldings* until we obtain a core graph. We denote the resulting core-graph by† $\Gamma_X(H)/P$. Since Stallings foldings do not affect $\pi_1^X$, this core graph $\Gamma_X(H)/P$ is $\Gamma_X(J)$, where $J = \pi_1^X(\Delta)$. The resulting partition $\bar{P}$ of $V(\Gamma_X(H))$ (the blocks of which are the fibers of $\eta^X_{H \to J}$) is the finest partition of $V(\Gamma_X(H))$ which gives a quotient core-graph and which is still coarser than $P$. We illustrate this in Figure 5.1.




Figure 5.1: The left graph is the core graph $\Gamma_X(H)$ of $H = \langle x_1 x_2 x_1^{-3},\ x_1^2 x_2 x_1^{-2} \rangle \le F_2$. Its vertices are denoted by $v_1, \ldots, v_4$. The graph in the middle is the quotient corresponding to the partition $P = \{\{v_1, v_4\}, \{v_2\}, \{v_3\}\}$. This is not a core graph, as there are two 1-edges originating at $\{v_1, v_4\}$. In order to obtain a core quotient-graph, we use the Stallings folding process and identify these two 1-edges and their termini. The resulting core graph, $\Gamma_X(H)/P$, is shown on the right and corresponds to the partition $\bar{P} = \{\{v_1, v_4\}, \{v_2, v_3\}\}$.

One can think of $\Gamma_X(J) = \Gamma_X(H)/P$ as the core graph "generated" from $\Gamma_X(H)$ by the partition $P$. It is now natural to look for the "simplest" partition generating $\Gamma_X(J)$. Formally, we introduce a measure for the complexity of partitions: if $P \subseteq 2^{\mathcal{X}}$ is a partition of some set $\mathcal{X}$, let

$$\|P\| \stackrel{\mathrm{def}}{=} |\mathcal{X}| - |P| = \sum_{B \in P} (|B| - 1). \qquad (5.1)$$

Namely, $\|P\|$ is the number of elements in the set minus the number of blocks in the partition. For example, $\|P\| = 1$ iff $P$ identifies only a single pair of elements. It is not hard to see that $\|P\|$ is also the minimal number of identifications one needs to make in $\mathcal{X}$ in order to obtain the equivalence relation $P$. Restricting to pairs of subgroups $H, J$ with $H \le_{\vec{X}} J$, we can define the following distance function:

Definition 5.5. Let $H, J \le_{fg} F_k$ be subgroups such that $H \le_{\vec{X}} J$, and let $\Gamma = \Gamma_X(H)$, $\Delta = \Gamma_X(J)$ be the corresponding core graphs. We define the $X$-distance between $H$ and $J$, denoted $\rho_X(H, J)$ or $\rho(\Gamma, \Delta)$, as

$$\rho_X(H, J) = \min\left\{ \|P\| \;\middle|\; P \text{ is a partition of } V(\Gamma_X(H)) \text{ s.t. } \Gamma_X(H)/P = \Gamma_X(J) \right\}.$$

For example, the rightmost core graph in Figure 5.1 is a quotient of the leftmost one, and the distance between them is 1. For a more geometric description of this distance function, as well as more details and further examples, we refer the readers to [Pud11, PP12].
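As a small illustration of the norm in (5.1), the following Python sketch computes $\|P\| = |\mathcal{X}| - |P|$ for the two partitions appearing in Figure 5.1. The function name `partition_norm` and the encoding of $v_1, \ldots, v_4$ as the integers 1 to 4 are assumptions made only for this example.

```python
def partition_norm(partition):
    """Return ||P|| = |X| - |P| for a partition given as an iterable of blocks."""
    blocks = [frozenset(b) for b in partition]
    elements = set().union(*blocks) if blocks else set()
    # Sanity check: blocks must be non-empty and pairwise disjoint.
    assert all(blocks) and sum(len(b) for b in blocks) == len(elements)
    return len(elements) - len(blocks)


# The two partitions from Figure 5.1, with v1..v4 encoded as 1..4 (illustrative encoding).
P = [{1, 4}, {2}, {3}]          # the partition that is chosen
P_bar = [{1, 4}, {2, 3}]        # the partition obtained after folding

print(partition_norm(P), partition_norm(P_bar))
```

The generating partition $P$ has norm 1 while the folded partition $\bar{P}$ has norm 2, which matches the remark that the distance is measured by the simplest partition generating the quotient, not by the fibers of the morphism.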

Of course, the distance function $\rho_X(H, J)$ is computable. It turns out that it can also be used to determine whether $H$ is a free factor of $J$:



*A folding means merging two equally-labeled edges with the same origin or with the same terminus. See also Figure 5.1. For a fuller description of Stallings foldings we refer the reader to [Pud11, PP12].

†In [PP12], the notation $\Gamma_X(H)/P$ was used to denote something a bit different (the unfolded graph $\Delta$).

Theorem 5.6. [Pud11, Theorem 1.1 and Lemma 3.3] Let $H, J \le_{fg} F_k$ be such that $H \le_{\vec{X}} J$. Then

$$\mathrm{rk}(J) - \mathrm{rk}(H) \le \rho_X(H, J) \le \mathrm{rk}(J).$$

Most importantly, the minimum is obtained (namely, $\mathrm{rk}(J) - \mathrm{rk}(H) = \rho_X(H, J)$) if and only if $H$ is a free factor of $J$.

This theorem is used, in particular, in the proof in [PP12] of Theorem 2.3.

So far the partitions considered here were partitions of the vertex set $V(\Gamma_X(H))$. However, it is also possible to identify (merge) different edges in $\Gamma_X(H)$, as long as they share the same label, and then, as before, perform the folding process to obtain a valid core graph. Moreover, it is possible to consider several partitions $P_1, \ldots, P_r$, each one either of the vertices or of the edges of $\Gamma_X(H)$, identify vertices and edges according to these partitions and then fold. We denote the resulting core graph by $\Gamma_X(H)/\langle P_1, \ldots, P_r \rangle$. It is easy to see that one can incorporate this more involved definition into the definition of the distance function $\rho_X(H, J)$, because, for instance, identifying two edges has the same effect as identifying their origins (or termini). In fact, the following holds:

$$\rho_X(H, J) = \min\left\{ \|P_1\| + \ldots + \|P_r\| \;\middle|\; \begin{array}{l} P_i \text{ is a partition of } V(\Gamma_X(H)) \text{ or of } E(\Gamma_X(H)) \\ \text{s.t. } \Gamma_X(H)/\langle P_1, \ldots, P_r \rangle = \Gamma_X(J) \end{array} \right\} \qquad (5.3)$$



5.3 From Random Elements of $S_n$ to Random Subgroups

Recall that Theorem 2.3 estimates $\mathbb{E}[\mathcal{F}_{w,n}]$, the expected number of fixed points of $w(\sigma_1, \ldots, \sigma_k)$, where $\sigma_1, \ldots, \sigma_k \in S_n$ are chosen independently at random in uniform distribution. The first step in the proof consists of a generalization of the problem to subgroups:

For every pair of f.g. subgroups $H \le J \le F_k$, let $\alpha_{J,S_n} : J \to S_n$ be a random homomorphism chosen at uniform distribution (there are exactly $|S_n|^{\mathrm{rk}(J)}$ such homomorphisms). Then $\alpha_{J,S_n}(H)$ is a random subgroup of $S_n$, and we count the number of common fixed points of this subgroup, namely the number of elements in $\{1, \ldots, n\}$ fixed by all permutations in $\alpha_{J,S_n}(H)$. Formally, we define

$$\Phi_{H,J}(n) \stackrel{\mathrm{def}}{=} \mathbb{E}\left[\#\,\text{common fixed points of } \alpha_{J,S_n}(H)\right].$$

This indeed generalizes $\mathbb{E}[\mathcal{F}_{w,n}]$, for

$$\mathbb{E}[\mathcal{F}_{w,n}] = \Phi_{\langle w \rangle, F_k}(n). \qquad (5.4)$$
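For very small $n$ the quantity $\mathbb{E}[\mathcal{F}_{w,n}]$ can be computed exactly by enumerating all $k$-tuples of permutations. The sketch below does this by brute force; the word encoding (a list of `(generator index, exponent)` pairs with exponent $\pm 1$), the left-to-right evaluation convention, and all function names are assumptions of this illustration and not notation from the paper.

```python
from fractions import Fraction
from itertools import permutations, product


def inverse(p):
    """Inverse of a permutation given as a tuple p with p[i] = image of i."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)


def evaluate(word, perms):
    """Apply the letters of the word, left to right, to every point of {0..n-1}."""
    n = len(perms[0])
    invs = [inverse(p) for p in perms]
    image = list(range(n))
    for gen, exp in word:
        step = perms[gen] if exp == 1 else invs[gen]
        image = [step[x] for x in image]
    return tuple(image)


def expected_fixed_points(word, k, n):
    """Exact average number of fixed points of w(s_1..s_k) over all of (S_n)^k."""
    sym = list(permutations(range(n)))
    total = 0
    for perms in product(sym, repeat=k):
        w = evaluate(word, perms)
        total += sum(1 for i, x in enumerate(w) if i == x)
    return Fraction(total, len(sym) ** k)


commutator = [(0, 1), (1, 1), (0, -1), (1, -1)]   # x1 x2 x1^-1 x2^-1
print(expected_fixed_points([(0, 1)], 1, 3))       # a primitive word
print(expected_fixed_points(commutator, 2, 3))     # a word of primitivity rank 2
```

The enumeration is exponential in $k$ and factorial in $n$, so it is only meant as a sanity check of the definitions on toy cases.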



5.4 Möbius Inversions

The theory of Möbius inversions applies to every poset (partially ordered set) with a locally-finite order (recall that an order $\preceq$ is locally-finite if for every $x, y$ with $x \preceq y$, the interval $[x, y]_{\preceq} \stackrel{\mathrm{def}}{=} \{z \mid x \preceq z \preceq y\}$ is finite). Here we skip the general definition and define these inversions directly in the special case of interest (for a more general point of view see [PP12]).

In our case, the poset in consideration is $\mathcal{S} = \{H \le F_k \mid H \text{ is f.g.}\}$, and the partial order is $\le_{\vec{X}}$, which is indeed locally-finite (Claim 5.4). We define three derivations of the function $\Phi$ defined in Section 5.3: the left one ($L$), the right one ($R$) and the two-sided one ($C$). These are usually formally defined by convolution of $\Phi$ with the Möbius function of $\mathcal{S}$ (see [PP12]), but here we define them in an equivalent simple way: these are the functions satisfying, for every $H \le_{\vec{X}} J$,

$$\Phi_{H,J}(n) = \sum_{M \in [H,J]_{\vec{X}}} L_{M,J}(n) = \sum_{M,N:\ H \le_{\vec{X}} M \le_{\vec{X}} N \le_{\vec{X}} J} C_{M,N}(n) = \sum_{N \in [H,J]_{\vec{X}}} R_{H,N}(n). \qquad (5.5)$$

Note that the summations in (5.5) are well defined because the order is locally finite. To see that (5.5) can indeed serve as the definition for the three new functions, use induction on $|[H, J]_{\vec{X}}|$: for example, for any $H \le_{\vec{X}} J$, $L_{H,J}(n) = \Phi_{H,J}(n) - \sum_{M \in (H,J]_{\vec{X}}} L_{M,J}(n)$, and all pairs $(M, J)$ on the r.h.s. satisfy $|[M, J]_{\vec{X}}| < |[H, J]_{\vec{X}}|$.
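The inductive definition of $L$ can be made concrete on any locally-finite poset. The sketch below uses a toy poset that has nothing to do with subgroups of $F_k$: the divisors of an integer ordered by divisibility, with a made-up function `phi` standing in for $\Phi$. It only illustrates how the recursion $L_{H,J} = \Phi_{H,J} - \sum_{M \in (H,J]} L_{M,J}$ determines $L$ and recovers the first equality of (5.5).

```python
from functools import lru_cache


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def interval(h, j):
    """Closed interval [h, j] in the divisibility order (toy stand-in for [H, J])."""
    return [m for m in divisors(j) if m % h == 0]


def phi(h, j):
    """Hypothetical stand-in for Phi_{H,J}; any function on comparable pairs would do."""
    return j // h


@lru_cache(maxsize=None)
def left_derivation(h, j):
    """L_{h,j} = phi(h, j) - sum of L_{m,j} over m in the half-open interval (h, j]."""
    return phi(h, j) - sum(left_derivation(m, j) for m in interval(h, j) if m != h)


# Summing L over the interval recovers phi, as in the first equality of (5.5).
for h in divisors(12):
    assert sum(left_derivation(m, 12) for m in interval(h, 12)) == phi(h, 12)
print(left_derivation(1, 12))
```

With this particular choice of `phi` the recursion reproduces Euler's totient function, which is the classical number-theoretic instance of Möbius inversion.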
With all this defined, we can state the main propositions along the proof of the main result in [PP12].

Proposition 5.7 ([PP12, Proposition 5.1]). The function $R$ is supported on algebraic extensions.

Namely, if $J$ is not an algebraic extension of $H$, then $R_{H,J}(n) = 0$ for every $n$. Since, if $H \le_{alg} J$ then $H \le_{\vec{X}} J$ (e.g. [PP12, Claim 4.2]), we obtain that

$$\Phi_{H,J}(n) = \sum_{N:\ H \le_{alg} N \le J} R_{H,N}(n). \qquad (5.6)$$



Next, $\Phi_{H,J}(n)$ is given a geometric interpretation: it turns out it is equal to the expected number of lifts of $\eta^X_{H \to J} : \Gamma_X(H) \to \Gamma_X(J)$ to a random $n$-covering of $\Gamma_X(J)$ in the model $\mathcal{C}_{n,\Gamma_X(J)}$ [PP12, Lem. 6.2]. Similarly, $L_{H,J}(n)$ counts the average number of injective lifts [PP12, Lem. 6.3]. For given $H$ and $J$, it is not hard to come up with an exact rational expression in $n$ for the expected number of injective lifts, i.e. for $L_{H,J}(n)$, for large enough $n$ (in fact, $n > |E(\Gamma_X(H))|$ suffices, see [PP12, Lem. 6.4]). As the other three functions ($\Phi$, $R$ and $C$) are obtained via the addition and subtraction of a finite number of $L_{M,J}(n)$'s, we get

Claim 5.8. Let $H, J \le F_k$ be f.g. subgroups such that $H \le_{\vec{X}} J$. Then for $n > |E(\Gamma_X(H))|$, the functions $\Phi_{H,J}(n)$, $L_{H,J}(n)$, $R_{H,J}(n)$ and $C_{H,J}(n)$ can all be expressed as rational expressions in $n$.


After some involved combinatorial arguments, one obtains from this the following expression for $C_{M,N}(n)$: Denote by $\mathrm{Sym}(S)$ the set of permutations of a given set $S$. Every permutation $\sigma \in \mathrm{Sym}(S)$ defines, in particular, a partition on $S$ whose blocks are the cycles of $\sigma$. By abuse of notation we denote by $\sigma$ both the permutation and the corresponding partition. For instance, one can consider its "norm" $\|\sigma\|$ (see (5.1); this is also the minimal length of a product of transpositions that gives the permutation $\sigma$). We also use $V_M$ and $E_M$ as short for $V(\Gamma_X(M))$ and $E(\Gamma_X(M))$, respectively.

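Proposition 5.9 ([PP12, Section 7.1]). Let $M, N \le F_k$ be f.g. subgroups such that $M \le_{\vec{X}} N$. Consider the set

$$\mathcal{T}_{M,N} = \left\{ (\sigma_0, \sigma_1, \ldots, \sigma_r) \;\middle|\; \begin{array}{l} r \in \mathbb{N},\ \sigma_0 \in \mathrm{Sym}(V_M) \\ \sigma_1, \ldots, \sigma_r \in \mathrm{Sym}(E_M) \setminus \{\mathrm{id}\} \\ \Gamma_X(M)/\langle \sigma_0, \sigma_1, \ldots, \sigma_r \rangle = \Gamma_X(N) \end{array} \right\}.$$

Then

$$C_{M,N}(n) = \frac{1}{n^{\mathrm{rk}(M)-1}} \sum_{(\sigma_0, \sigma_1, \ldots, \sigma_r) \in \mathcal{T}_{M,N}} (-1)^r \cdot \left(-\frac{1}{n}\right)^{\|\sigma_0\| + \ldots + \|\sigma_r\|}.$$

The way from Theorem 5.6 and Propositions 5.7 and 5.9 to proving the main result of [PP12] (Theorem 2.3) is short: see the beginning of Section 7 in [PP12].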

5.5 Proving the Uniform Bound for the Error Term 

We now have all the tools required for proving Proposition 5.1. Namely, we now prove that for every $w \in F_k$ of length $t$ and every $n > t^2$,

$$\mathbb{E}[\mathcal{F}_{w,n}] \le 1 + \frac{1}{n^{\pi(w)-1}}\left(|\mathrm{Crit}(w)| + \frac{t^{2+2\pi(w)}}{n-t^2}\right).$$

(Note that we lose nothing by passing to reduced words. Reducing an element of $(X \cup X^{-1})^t$ does not affect $\mathbb{E}[\mathcal{F}_{w,n}]$, and only tightens the upper bound.)
Proof [of Proposition 5.1]. Recall (Section 5.3) that $\mathbb{E}[\mathcal{F}_{w,n}] = \Phi_{\langle w \rangle, F_k}(n)$, and this quantity is given by some rational expression in $n$ (for large enough $n$, say $n > |w|$, see Claim 5.8). This expression can be expressed as a Taylor series in $\frac{1}{n}$, so write

$$\mathbb{E}[\mathcal{F}_{w,n}] = \sum_{s=0}^{\infty} \frac{a_s(w)}{n^s},$$

where $a_s(w) \in \mathbb{R}$ (in fact these are integers: see [Pud11, Claim 5.1] and also the sequel of this proof). By Theorem 2.3, $a_0 = 1$, $a_1 = a_2 = \ldots = a_{\pi(w)-2} = 0$ and $a_{\pi(w)-1} = |\mathrm{Crit}(w)|$ (unless $\pi(w) = 1$, in which case $a_0 = 1 + |\mathrm{Crit}(w)|$). So our goal here is to bound the remaining coefficients $a_s(w)$ for $s \ge \pi(w)$.

The discussion in Section 5.4 yields the following equations:

$$\mathbb{E}[\mathcal{F}_{w,n}] = \Phi_{\langle w \rangle, F_k}(n) = \sum_{N:\ \langle w \rangle \le_{alg} N \le F_k} R_{\langle w \rangle, N}(n) = \sum_{M,N:\ \langle w \rangle \le_{\vec{X}} M \le_{\vec{X}} N} C_{M,N}(n) = \sum_{M:\ \langle w \rangle \le_{\vec{X}} M} \ \sum_{N:\ M \le_{\vec{X}} N} C_{M,N}(n).$$

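From Proposition 5.9 we obtain that for a fixed $M$

$$\sum_{N:\ M \le_{\vec{X}} N} C_{M,N}(n) = \frac{1}{n^{\mathrm{rk}(M)-1}} \sum_{\substack{r \in \mathbb{N},\ \sigma_0 \in \mathrm{Sym}(V_M) \\ \sigma_1, \ldots, \sigma_r \in \mathrm{Sym}(E_M) \setminus \{\mathrm{id}\}}} (-1)^r \left(-\frac{1}{n}\right)^{\|\sigma_0\| + \ldots + \|\sigma_r\|}.$$

For every $q \ge 0$ define the following set:

$$\mathcal{P}_{M,q} = \left\{ (\sigma_0, \ldots, \sigma_r) \;\middle|\; \begin{array}{l} r \in \mathbb{N},\ \sigma_0 \in \mathrm{Sym}(V_M) \\ \sigma_1, \ldots, \sigma_r \in \mathrm{Sym}(E_M) \setminus \{\mathrm{id}\} \\ \|\sigma_0\| + \ldots + \|\sigma_r\| = q \end{array} \right\} \qquad (5.7)$$

so that

$$\sum_{N:\ M \le_{\vec{X}} N} C_{M,N}(n) = \frac{1}{n^{\mathrm{rk}(M)-1}} \sum_{q=0}^{\infty} \frac{(-1)^q}{n^q} \sum_{(\sigma_0, \ldots, \sigma_r) \in \mathcal{P}_{M,q}} (-1)^r.$$

Hence,

$$a_s(w) = \sum_{i=1}^{s+1} \ \sum_{\substack{M:\ \langle w \rangle \le_{\vec{X}} M \\ \mathrm{rk}(M) = i}} (-1)^{s-(i-1)} \sum_{(\sigma_0, \ldots, \sigma_r) \in \mathcal{P}_{M,s-(i-1)}} (-1)^r. \qquad (5.8)$$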

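In what follows we ignore the alternating signs of the summands in (5.8) and bound $a_s(w)$ by

$$a_s(w) \le \sum_{i=1}^{s+1} \ \sum_{\substack{M:\ \langle w \rangle \le_{\vec{X}} M \\ \mathrm{rk}(M) = i}} \left|\mathcal{P}_{M,s-(i-1)}\right|. \qquad (5.9)$$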



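Claim: $|\mathcal{P}_{M,q}| \le t^{2q}$.

Proof of Claim: Fix $M$ and denote $b_q = |\mathcal{P}_{M,q}|$. Clearly, $b_0 = 1$, and we proceed by induction on $q$. Let $q \ge 1$. We split the set $\mathcal{P}_{M,q}$ by the value of $\sigma_r$. For $r = 0$ there are at most

$$\left|\{\sigma \in \mathrm{Sym}(V_M) \mid \|\sigma\| = q\}\right| \le \binom{|V_M|}{2}^q \le \binom{t}{2}^q \le \frac{t^{2q}}{2^q}$$

elements with $r = 0$. (For the middle inequality note that $|V_M| \le |V_{\langle w \rangle}| \le t$; this is also the case for the edges: $|E_M| \le |E_{\langle w \rangle}| \le t$.) For $r \ge 1$, $\sigma_r$ is a permutation of the set of edges $E_M$, and given $\sigma_r$, the number of options for $\sigma_0, \ldots, \sigma_{r-1}$ is exactly $b_{q - \|\sigma_r\|}$. By the induction hypothesis we obtain:

$$b_q \le \frac{t^{2q}}{2^q} + \sum_{\sigma_r \in \mathrm{Sym}(E_M) \setminus \{\mathrm{id}\}} b_{q - \|\sigma_r\|} = \frac{t^{2q}}{2^q} + \sum_{a=1}^{q} b_{q-a} \cdot \left|\{\sigma \in \mathrm{Sym}(E_M) \mid \|\sigma\| = a\}\right|$$

$$\le \frac{t^{2q}}{2^q} + \sum_{a=1}^{q} t^{2q-2a} \cdot \frac{t^{2a}}{2^a} = t^{2q}. \qquad \square$$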
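Two ingredients of the claim are easy to check by machine on small cases: the count of permutations of a given norm against the bound $\binom{n}{2}^q$, and the closing arithmetic identity of the induction. The sketch below does both; the function names are illustrative, and `induction_rhs` simply evaluates the right-hand side $\frac{t^{2q}}{2^q} + \sum_{a=1}^{q} t^{2q-2a} \cdot \frac{t^{2a}}{2^a}$ with exact fractions.

```python
from fractions import Fraction
from itertools import permutations
from math import comb


def norm(perm):
    """||sigma|| = (number of points) - (number of cycles) for a permutation tuple."""
    n, seen, cycles = len(perm), set(), 0
    for start in range(n):
        if start not in seen:
            cycles += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = perm[x]
    return n - cycles


def count_by_norm(n):
    """counts[q] = number of permutations in S_n with norm q, for q = 0..n-1."""
    counts = [0] * n
    for perm in permutations(range(n)):
        counts[norm(perm)] += 1
    return counts


def induction_rhs(t, q):
    """t^(2q)/2^q + sum_{a=1}^{q} t^(2q-2a) * t^(2a)/2^a, computed exactly."""
    head = Fraction(t ** (2 * q), 2 ** q)
    tail = sum(Fraction(t ** (2 * q - 2 * a) * t ** (2 * a), 2 ** a) for a in range(1, q + 1))
    return head + tail


counts = count_by_norm(4)
print(counts, [comb(4, 2) ** q for q in range(4)])
print(induction_rhs(6, 3) == 6 ** 6)
```

The first line of output compares the exact counts with the bound used for the case $r = 0$; the second confirms that the geometric sum in the induction step closes up to exactly $t^{2q}$.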

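We proceed with the proof of the proposition. For a given $w \in (X \cup X^{-1})^t$ there are at most $\binom{|V_{\langle w \rangle}|}{2}^{\beta} \le \left(\frac{t^2}{2}\right)^{\beta}$ partitions of norm $\beta$ of $V_{\langle w \rangle}$, and so at most $\left(\frac{t^2}{2}\right)^{\beta}$ subgroups $M$ of rank $\beta$ with $\langle w \rangle \le_{\vec{X}} M$ (see Theorem 5.6). Hence from (5.9) we obtain,

$$a_s(w) \le \sum_{i=1}^{s+1} \left(\frac{t^2}{2}\right)^{i} t^{2(s-(i-1))} = \sum_{i=1}^{s+1} \frac{1}{2^i}\, t^{2(s+1)} \le t^{2s+2}.$$

Finally,

$$\mathbb{E}[\mathcal{F}_{w,n}] - 1 - \frac{|\mathrm{Crit}(w)|}{n^{\pi(w)-1}} = \sum_{s=\pi(w)}^{\infty} \frac{a_s(w)}{n^s} \le \sum_{s=\pi(w)}^{\infty} \frac{t^{2s+2}}{n^s} = \frac{1}{n^{\pi(w)-1}} \cdot \frac{t^{2+2\pi(w)}}{n - t^2}.$$

This finishes the proof. □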



6 Completing the Proof for Regular Graphs 

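In this section we complete the proofs of Theorems 1.1 and 1.5. In addition, we explain (in Section 6.1) the source of the gap between these results on the one hand and Friedman's result and Conjecture 1.3 on the other. We begin with Theorem 1.1. Recall that we need to prove that for $d$ even, a random $d$-regular graph $\Gamma$ on $n$ vertices in the permutation model (a random $n$-covering of the bouquet) satisfies $\lambda(\Gamma) < 2\sqrt{d-1} + 0.84$ a.a.s., where $\lambda(\Gamma)$ is the largest non-trivial eigenvalue of $A_\Gamma$.

So let $d = 2k$ and $n, t = t(n)$ be such that $n > t^2$ and $t$ is even. The base graph $\Omega$ is now the bouquet of $k$ loops, so $\mathcal{CP}_t(\Omega) = (X \cup X^{-1})^t$. By (2.2), Proposition 5.1 and Claim 5.2,

$$\mathbb{E}\left[\lambda(\Gamma)^t\right] \le \sum_{w \in (X \cup X^{-1})^t} \left(\mathbb{E}[\mathcal{F}_{w,n}] - 1\right) = \sum_{m=0}^{k} \ \sum_{\substack{w \in (X \cup X^{-1})^t: \\ \pi(w) = m}} \left(\mathbb{E}[\mathcal{F}_{w,n}] - 1\right)$$

$$\le \sum_{m=0}^{k} \frac{1}{n^{m-1}} \sum_{\substack{w \in (X \cup X^{-1})^t: \\ \pi(w) = m}} \left(|\mathrm{Crit}(w)| + \frac{t^{2+2m}}{n - t^2}\right)$$

$$\le \left(1 + \frac{t^{2+2k}}{n - t^2}\right) \sum_{m=0}^{k} \frac{1}{n^{m-1}} \sum_{\substack{w \in (X \cup X^{-1})^t: \\ \pi(w) = m}} |\mathrm{Crit}(w)|.$$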
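Let $\varepsilon > 0$. For $m \in \{0, 1, \ldots, k\}$, Corollary 4.5 (for $m \ge 1$) and Claim 4.7 (for $m = 0$) yield that for large enough $t$,

$$\sum_{\substack{w \in (X \cup X^{-1})^t: \\ \pi(w) = m}} |\mathrm{Crit}(w)| \le \left[g(2m-1) + \varepsilon\right]^t,$$

where $g(\cdot)$ is defined as in (4.4):

$$g(2m-1) = \begin{cases} 2\sqrt{d-1} & 2m-1 \le \sqrt{d-1} \\ \frac{d-1}{2m-1} + 2m-1 & 2m-1 \ge \sqrt{d-1} \end{cases}$$

Thus

$$\mathbb{E}\left[\lambda(\Gamma)^t\right] \le \left(1 + \frac{t^{2+2k}}{n - t^2}\right) \sum_{m=0}^{k} \frac{1}{n^{m-1}} \left[g(2m-1) + \varepsilon\right]^t$$

$$\le \left(1 + \frac{t^{2+2k}}{n - t^2}\right) (k+1) \cdot \left[\max\left\{ n^{1/t}\left[g(-1) + \varepsilon\right],\ g(1) + \varepsilon,\ \ldots,\ \frac{g(2k-3) + \varepsilon}{(n^{1/t})^{k-2}},\ \frac{g(2k-1) + \varepsilon}{(n^{1/t})^{k-1}} \right\}\right]^t. \qquad (6.1)$$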



Recall that $\Gamma$ is a random graph on $n$ vertices. In order to obtain the best bound, $t$ needs to be chosen to minimize the maximal summand in the r.h.s. of (6.1). This requires $t = \Theta(\log n)$: if $t$ is larger than that, the last elements are unbounded, and if $t$ is smaller than that, the first element is unbounded. Thus, in particular, $\left(1 + \frac{t^{2+2k}}{n - t^2}\right) = 1 + o_n(1)$. We show that for every $d$ there is a constant $c(d)$, such that if $t$ is chosen so that $n^{1/t} \approx c$, then all $k+1$ elements in the set are strictly less than $2\sqrt{d-1} + 0.835$ (for small enough $\varepsilon$). Thus, for large enough $t$,

$$\mathbb{E}\left[\lambda(\Gamma)^t\right] \le \left[2\sqrt{d-1} + 0.835\right]^t.$$

A standard application of Markov's inequality then shows that

$$\mathrm{Prob}\left[\lambda(\Gamma) < 2\sqrt{d-1} + 0.84\right] \to 1.$$

Indeed, for $d \ge 26$, one can set $n^{1/t} = e^{\frac{2}{5\sqrt{d-1}}}$. Simple analysis shows that for $d \ge 26$, $e^{\frac{2}{5\sqrt{d-1}}} < 1 + \frac{5}{12\sqrt{d-1}}$, so the element corresponding to $m = 0$ is at most

$$2\sqrt{d-1} \cdot e^{\frac{2}{5\sqrt{d-1}}} < 2\sqrt{d-1}\left(1 + \frac{5}{12\sqrt{d-1}}\right) = 2\sqrt{d-1} + \frac{5}{6}.$$

This first element is clearly larger than all other elements corresponding to $m$ such that $2m-1 \le \sqrt{d-1}$. Among all other values of $m$, the maximal element is obtained when $2m-1 \approx 4.55\sqrt{d-1}$, but its value is bounded from above by $1.94\sqrt{d-1} + 0.4$ (again, by simple analysis). For all remaining $d$'s $(4, 6, \ldots, 24)$, it can be checked case by case that choosing $n^{1/t}$ so that $n^{1/t} \cdot 2\sqrt{d-1} = 2\sqrt{d-1} + 0.8$ works. □
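The case analysis at the end of this proof can be reproduced numerically. The sketch below evaluates the elements of the set in (6.1) with $\varepsilon = 0$, using $n^{1/t} = e^{2/(5\sqrt{d-1})}$ for $d \ge 26$ and $n^{1/t} = 1 + 0.4/\sqrt{d-1}$ for smaller even $d$. These two choices are this edition's reading of the (garbled) source, and the function names are illustrative, so the snippet should be taken as a plausibility check rather than as part of the proof.

```python
from math import exp, sqrt


def g(x, d):
    """The function g from (4.4): 2*sqrt(d-1) below sqrt(d-1), x + (d-1)/x above it."""
    s = sqrt(d - 1)
    return 2 * s if x <= s else x + (d - 1) / x


def base_choice(d):
    """Assumed value of n^(1/t): exponential choice for d >= 26, linear one otherwise."""
    s = sqrt(d - 1)
    return exp(2 / (5 * s)) if d >= 26 else 1 + 0.4 / s


def max_element(d):
    """Largest element of the set in (6.1) with epsilon = 0, for even d = 2k."""
    k, x = d // 2, base_choice(d)
    return max(g(2 * m - 1, d) / x ** (m - 1) for m in range(k + 1))


for d in (4, 10, 26, 100):
    print(d, round(max_element(d) - 2 * sqrt(d - 1), 4))
```

The printed numbers are the excess of the maximal element over $2\sqrt{d-1}$; for the sampled degrees they stay below 0.835, in line with the statement of the proof.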

Proof of Theorem 1.5. The only change from the previous case is that the summation in (6.1) over the primitivity rank $m$ does not stop at $k = \frac{d}{2}$ but continues until $\mathrm{rk}(\Omega) = |V(\Omega)|\left(\frac{d}{2} - 1\right) + 1$. However, when $m > k$, it follows from Corollary 4.14 that the corresponding term inside the max operator is $\frac{d}{(n^{1/t})^{m-1}}$, which is strictly less than $\frac{d}{(n^{1/t})^{d/2-1}}$ (for every choice of $t$ and $n$), but this latter term is already there in (6.1). Thus, the maximal term remains unchanged, and we obtain the same bound overall as in Theorem 1.1. □



6.1 The Source of the Gap and Other Remarks 

It could be desirable to use the approach presented in this paper and replace the constant 1 in Theorem 1.1 with arbitrary $\varepsilon > 0$, to obtain Friedman's tight result. Unfortunately, this is still
beyond our reach. It is possible, however, to point out the source of the gap and how it may be 
potentially overcome. 
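Remark 6.1. Why $t$ has to grow faster than $\Theta(\log n)$. Denote $\rho = 2\sqrt{d-1}$. Recall from (2.1) that the bound used for $\mathbb{E}\left[\lambda(\Gamma)^t\right]$ actually bounds $\mathbb{E}\left[\sum_{\lambda \in \mathrm{Spec}(A_\Gamma) \setminus \{d\}} \lambda^t\right]$. It is known (e.g. [GZ99, Corr. 1]) that for every $\delta > 0$ there exists $0 < \varepsilon < 1$ such that at least $\varepsilon \cdot n$ of the eigenvalues satisfy $|\lambda| > \rho - \delta$. If $t \in O(\log n)$ then $n^{1/t} \ge c$ for some constant $c > 1$. But then

$$\sum_{\lambda \in \mathrm{Spec}(A_\Gamma) \setminus \{d\}} \lambda^t \ge \varepsilon n (\rho - \delta)^t \ge \left[\varepsilon^{1/t}\, c\, (\rho - \delta)\right]^t.$$

Thus, the upper bound obtained for $\left(\mathbb{E}\left[\lambda(\Gamma)^t\right]\right)^{1/t}$ is at least $c \cdot \rho$, which is strictly greater than $\rho$.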

Remark 6.2. The gap in the proof and how it may be overcome. Most of the steps in the proof are tight, in the sense that the bounds we get are the right, actual values. Indeed, in the first step (if we follow the structure of the proof as explained in Section 2), the only strict inequality comes from the fact that we bound $\lambda(\Gamma)^t$ by $\sum_{\lambda \in \mathrm{Spec}(A_\Gamma) \setminus \{d\}} \lambda^t$, but if $t \gg \log n$ the two become equal as $n \to \infty$. The second step, which relies on Theorem 2.3, has only equalities, so it is surely tight. In the fourth step, the bound we have for the exponential growth rate of $\sum_{w \in (X \cup X^{-1})^t:\ \pi(w) = m} |\mathrm{Crit}(w)|$ is also the correct value (see Theorem 8.5). But if $t \gg \log n$, the last $k - 2$ elements in the r.h.s. of (6.1) are unbounded. So what is the source of this incorrect bound?

The gap has to come from the third step, where we analyze the error term in the expression for $\mathbb{E}[\mathcal{F}_{w,n}]$. This is the only place in the proof which is wasteful. During the proof of Proposition 5.1 the coefficients $a_s(w)$ are bounded. An exact expression for these is given in (5.8), but then we ignore the changing signs of the terms and obtain a somewhat wasteful bound. Moreover, Proposition 5.7 yields that certain subsets of the terms in (5.8) offset each other, yet this is ignored in the proof. We believe that this is where the gap between our bound and the tight bound can be overcome. In fact, it seems that many of the $a_s(w)$'s, including the first one, $a_{\pi(w)}(w)$, are often negative, and that the error term, $\sum_{s \ge \pi(w)} \frac{a_s(w)}{n^s}$, actually tends to be negative. In other words, $\mathbb{E}[\mathcal{F}_{w,n}]$ is smaller than $1 + \frac{|\mathrm{Crit}(w)|}{n^{\pi(w)-1}}$, and substantially smaller when $|w|$ is large compared to $\log n$. This would allow us to let $t$ grow faster than $\Theta(\log n)$, while obtaining better bounds for the $k - 1$ last elements in (6.1).



Remark 6.3. The optimal bounds for small $d$'s: The smallest constant $c$ for which a bound of $2\sqrt{d-1} + c$ can be obtained here (for all $d$'s) is about 0.8. For small values of $d$, however, there are better bounds. Consider for example the case $d = 4$. In order to minimize $\max\left\{ n^{1/t} \cdot 2\sqrt{d-1},\ 2\sqrt{d-1},\ \frac{d}{n^{1/t}} \right\}$, choose $n^{1/t} = \sqrt{\frac{d}{2\sqrt{d-1}}}$ to get an upper bound of 3.723 (compared with $2\sqrt{d-1} = 3.464$, so here $c \approx 0.259$). Similarly, for $d = 3$ (relevant for Theorem 1.5) one can obtain an upper bound of $\sqrt{3 \cdot \rho} \approx 2.913$.
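The two numerical values in this remark follow from balancing the first and last elements of the three-element set, that is, from solving $x \cdot \rho = d / x$ with $\rho = 2\sqrt{d-1}$ and $x = n^{1/t}$. A minimal sketch of this computation (the function name is an assumption of this example):

```python
from math import sqrt


def balanced_bound(d):
    """Value of max{x*rho, rho, d/x} at the balancing point x = sqrt(d / rho)."""
    rho = 2 * sqrt(d - 1)
    x = sqrt(d / rho)  # the choice of n^(1/t) that equates the first and last elements
    return max(x * rho, rho, d / x)


print(round(balanced_bound(4), 3), round(balanced_bound(3), 3))
```

At the balancing point both extreme elements equal $\sqrt{d \cdot \rho}$, which is why the bound for $d = 3$ has the closed form $\sqrt{3\rho}$.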

Remark 6.4. Odd $d$'s. Another gap in our approach concerns odd values of $d$. At the moment, we are unable to extend our proof to this case. One plausible direction is as follows. By [Wor99, Cor. 4.17] and [GJKW02, Thm 1.3], the buckets model† $\mathcal{G}^*_{n,d}$ for $d$ odd is contiguous to the model where we take $k = \frac{d-1}{2}$ random permutations plus one random perfect matching on the $n$ vertices‡. (When $d$ is odd, $n$ must be even.) If we label the edges corresponding to the perfect matching by $b$, and orient the edges corresponding to the permutations and label them by $a_1, \ldots, a_k$, the graphs become Schreier graphs of subgroups of $F_k * \mathbb{Z}/2\mathbb{Z} = \langle a_1, \ldots, a_k, b \mid b^2 = 1 \rangle$. It is possible that the machinery we developed for the free group (and especially, Theorem 2.3) can also be developed for this kind of free products.

†See the end of Section 1.

‡However, not every $d$-regular graph with $d$ odd belongs to this model: some graphs may not contain a perfect matching. This is in contrast to the even case, where every $d$-regular graph can be obtained in the permutation model.
7 Completing the Proof for Arbitrary Graphs

The completion of the proof of Theorem 1.4 is presented in this section. We begin with the proof of the first statement of the theorem, which concerns the spectrum of the adjacency operator of $\Gamma$, the random $n$-covering of the fixed base graph $\Omega$. The variations needed in order to establish the statement about the Markovian operator are described in Section 7.1.

Recall that $\rho = \rho_A(\Omega)$ denotes the spectral radius of the adjacency operator of the covering tree. Our goal now is to prove that for every $\varepsilon > 0$, $\lambda_A(\Gamma)$, the largest absolute value of a non-trivial eigenvalue of the adjacency operator $A_\Gamma$, satisfies asymptotically almost surely

$$\lambda_A(\Gamma) < \sqrt{3} \cdot \rho + \varepsilon. \qquad (7.1)$$

As in the proof of Theorem 1.1 (the beginning of Section 6), let $n, t = t(n)$ be so that $n > t^2$ and $t$ is even. Using (2.2), Proposition 5.1, Claim 5.2 and Lemma 4.9, one obtains



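$$\mathbb{E}\left[\lambda_A(\Gamma)^t\right] \le \sum_{w \in \mathcal{CP}_t(\Omega)} \left(\mathbb{E}[\mathcal{F}_{w,n}] - 1\right) \le \left(1 + \frac{t^{2+2\mathrm{rk}(\Omega)}}{n - t^2}\right) \sum_{m=0}^{\mathrm{rk}(\Omega)} \frac{1}{n^{m-1}} \sum_{w \in \mathcal{CP}_t^m(\Omega)} |\mathrm{Crit}(w)|.$$

Let $\varepsilon > 0$. From Theorem 4.8 and Lemma 4.10 it follows now that for $t$ large enough,

$$\mathbb{E}\left[\lambda_A(\Gamma)^t\right] \le \left(1 + \frac{t^{2+2\mathrm{rk}(\Omega)}}{n - t^2}\right) \left( n \cdot [\rho + \varepsilon]^t + \sum_{m=1}^{\mathrm{rk}(\Omega)} \frac{1}{n^{m-1}} \left[(2m-1)\rho + \varepsilon\right]^t \right)$$

$$\le \left(1 + \frac{t^{2+2\mathrm{rk}(\Omega)}}{n - t^2}\right) (1 + \mathrm{rk}(\Omega)) \cdot \left[\max\left\{ n^{1/t}[\rho + \varepsilon],\ \rho + \varepsilon,\ \frac{3\rho + \varepsilon}{n^{1/t}},\ \frac{5\rho + \varepsilon}{(n^{1/t})^2},\ \ldots,\ \frac{(2\,\mathrm{rk}(\Omega) - 1)\rho + \varepsilon}{(n^{1/t})^{\mathrm{rk}(\Omega)-1}} \right\}\right]^t. \qquad (7.2)$$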



Again, to obtain a bound we must have $t \in \Theta(\log n)$, and the best bound we can obtain in this general case is obtained by choosing $n^{1/t} \approx \sqrt{3}$, so $\left(1 + \frac{t^{2+2\mathrm{rk}(\Omega)}}{n - t^2}\right)^{1/t} \to 1$, and the maximal value inside the set in (7.2) is then $\sqrt{3}(\rho + \varepsilon)$. Again, a standard application of Markov's inequality finishes the proof. □



7.1 The Spectrum of the Markov Operator 

After establishing the first statement of Theorem 1.4, we want to explain how the proof should be modified to apply to $\lambda_M(\Gamma)$, the maximal absolute value of a non-trivial eigenvalue of the Markov operator on $\Gamma$. The goal then is to show that for every $\varepsilon > 0$

$$\lambda_M(\Gamma) < \sqrt{3} \cdot \rho_M(\Omega) + \varepsilon \qquad (7.3)$$

asymptotically almost surely.

As we note in Appendix B, the Markovian operator is given by $B_\Gamma D_\Gamma^{-1}$, where $B_\Gamma$ is the adjacency matrix and $D_\Gamma$ the diagonal matrix with the degrees of vertices in the diagonal. This is conjugate to, and thus shares the same spectrum with, $Q_\Gamma = D_\Gamma^{-1/2} B_\Gamma D_\Gamma^{-1/2}$, but the latter has the advantage of being symmetric, so we work with it.

The $(u, v)$ entry of $Q_\Gamma$ equals $\frac{1}{\sqrt{\deg(u)\deg(v)}}$ times the number of edges between $u$ and $v$. To every path $w$ in $\Gamma$ we assign a weight $f(w)$ as follows: if $w$ starts at $v_0$, then visits $v_1, v_2, \ldots, v_{t-1}$ and ends at $v_t$, then $f(w) = \frac{1}{\sqrt{\deg(v_0)} \cdot \deg(v_1) \cdots \deg(v_{t-1}) \cdot \sqrt{\deg(v_t)}}$. It is easy to see that $\left[Q_\Gamma^t\right]_{u,v}$ equals the sum of $f$ over all paths $w$ of length $t$ from $u$ to $v$, and thus

$$\sum_{\lambda \in \mathrm{Spec}(M_\Gamma)} \lambda^t = \mathrm{tr}\, M_\Gamma^t = \mathrm{tr}\, Q_\Gamma^t = \sum_{w \in \mathcal{CP}_t(\Gamma)} f(w).$$
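The identity between $\mathrm{tr}\, Q_\Gamma^t$ and the weighted count of closed paths can be checked directly on a small example. The sketch below uses a made-up simple graph (a triangle with one pendant vertex); for a closed path the weight reduces to $f(w) = 1/(\deg(v_0)\deg(v_1)\cdots\deg(v_{t-1}))$ because $v_0 = v_t$. The graph, the function names and the plain-Python matrix product are all assumptions of this illustration.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical example graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
ADJ = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}


def closed_path_weight_sum(adj, t):
    """Sum of f(w) over all closed paths w of length t, computed exactly."""
    total = Fraction(0)
    for start in adj:
        # frontier[v] = total weight of paths of the current length from start to v,
        # where each step out of a vertex x contributes a factor 1/deg(x).
        frontier = {start: Fraction(1)}
        for _ in range(t):
            nxt = {}
            for v, weight in frontier.items():
                for u in adj[v]:
                    nxt[u] = nxt.get(u, Fraction(0)) + weight / len(adj[v])
            frontier = nxt
        total += frontier.get(start, Fraction(0))
    return total


def trace_of_q_power(adj, t):
    """tr(Q^t) for Q = D^(-1/2) B D^(-1/2), using floating point arithmetic."""
    n = len(adj)
    q = [[(1 / sqrt(len(adj[u]) * len(adj[v])) if v in adj[u] else 0.0)
          for v in range(n)] for u in range(n)]
    power = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(t):
        power = [[sum(power[i][l] * q[l][j] for l in range(n)) for j in range(n)]
                 for i in range(n)]
    return sum(power[i][i] for i in range(n))


print(closed_path_weight_sum(ADJ, 2), trace_of_q_power(ADJ, 2))
```

The exact path sum and the floating point trace agree up to rounding, which is the content of the displayed identity for this toy graph.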



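Moreover, note that when a path from the covering $\Gamma$ is projected to the base graph $\Omega$, its weight does not change. Using this equality, we can imitate step I from Section 2 to obtain

$$\lambda_M(\Gamma)^t \le \sum_{\lambda \in \mathrm{Spec}(M_\Gamma)} \lambda^t - \sum_{\lambda \in \mathrm{Spec}(M_\Omega)} \lambda^t = \sum_{w \in \mathcal{CP}_t(\Gamma)} f(w) - \sum_{w \in \mathcal{CP}_t(\Omega)} f(w) = \sum_{w \in \mathcal{CP}_t(\Omega)} f(w) \left[\mathcal{F}_{w,n} - 1\right].$$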

The second and third steps remain the same, to obtain

$$\mathbb{E}\left[\lambda_M(\Gamma)^t\right] \le \left(1 + \frac{t^{2+2\mathrm{rk}(\Omega)}}{n - t^2}\right) \sum_{m=0}^{\mathrm{rk}(\Omega)} \frac{1}{n^{m-1}} \sum_{w \in \mathcal{CP}_t^m(\Omega)} f(w)\,|\mathrm{Crit}(w)|.$$

Hence, the next modification needs to take place in the fourth step, where instead of bounding $\sum_{w \in \mathcal{CP}_t(\Omega):\ \pi(w) = m} |\mathrm{Crit}(w)|$, one needs to bound $\sum_{w \in \mathcal{CP}_t(\Omega):\ \pi(w) = m} f(w)\,|\mathrm{Crit}(w)|$. But the exact same proofs work if we just replace $\rho_A(\Omega)$ with $\rho_M(\Omega)$. Theorem 4.8 becomes

$$\limsup_{t \to \infty} \left[\sum_{w \in \mathcal{CP}_t^m(\Omega)} f(w)\,|\mathrm{Crit}(w)|\right]^{1/t} \le (2m-1) \cdot \rho_M(\Omega) \qquad (7.4)$$

and likewise, Lemma 4.10 becomes

$$\limsup_{t \to \infty} \left[\sum_{w \in \mathcal{CP}_t^0(\Omega)} f(w)\right]^{1/t} = \rho_M(\Omega).$$

Similarly, the definition of $\beta_t(N)$ (before Claim 4.11) should be modified to

$$\beta_t(N) = \sum_{w \in \mathcal{CP}_t(\Omega):\ \langle w \rangle \le_{alg} N} f(w),$$

and in the proof of Claim 4.11 one should use the fact that $f(w)$ does not change when the closed path $w$ is cyclically rotated. Finally, in the proof of Proposition 4.13, we sometimes replace a path with its inverse and use the symmetry of the operator. This is the reason for working with $Q_\Gamma$ rather than with $M_\Gamma$. (The coefficient $|V(\Omega)|$ from the statement of the proposition needs to be replaced with some constant function of the degrees of all vertices.)

Because the bounds in (7.4) are exactly those in Theorem 4.8, only with $\rho_M(\Omega)$ instead of $\rho_A(\Omega)$, the final step of the proof (which appears in Section 7) can be imitated one-to-one.



8 The Distribution of Primitivity Ranks 

In this subsection we show that most of the upper bounds from Proposition 4.3 and all the upper
bounds from Corollary 4.5 are the accurate exponential growth rates of the number of words
(reduced or not) and critical subgroups with a given primitivity rank. This is not needed for the
proof of the main results of this paper. However, it does show that in the proof of Theorem 1.1
the fourth step of the proof, where words and critical subgroups are counted, yields a tight bound.
Thus, the origin of the gap between our result and Friedman's lies elsewhere (see Section UTTj) . 
First, let us recall a theorem due to the author and Wu which counts primitive words in $F_k$.






Theorem 8.1. [PW13] For every $k \ge 3$, let $p_k(t)$ denote the number of primitive words of length
$t$ in $F_k$. Then,

$$\lim_{t \to \infty} \sqrt[t]{p_k(t)} = 2k - 3.$$

For $F_2$ it is known that this exponential growth rate equals $\sqrt{3}$ ([MS03]). These results show
that the portion of primitive words among all words of length $t$ decays exponentially fast†. This allows
us to prove that most of the upper bounds from Proposition 4.3 are accurate. In fact, we believe
all the upper bounds from this proposition are accurate, see Remark 8.4 below.



Theorem 8.2. Let $k \ge 2$ and $m \in \{1, 2, \ldots, k\}$. Let

$$c_{k,m}(t) = \left|\{ w \in F_k \;:\; |w| = t,\ \pi(w) = m \}\right|.$$

Then, for every $m$ satisfying $2m - 1 > \sqrt{2k-1}$,

$$\limsup_{t \to \infty} c_{k,m}(t)^{1/t} = \lim_{t \to \infty} c_{k,m}(t)^{1/t} = 2m - 1. \tag{8.1}$$

Corollary 8.3. A generic word in $F_k$ has primitivity rank $k$.

Proof of Theorem 8.2. The r.h.s. of (8.1) is an upper bound for the lim sup by Proposition 4.3.
So it remains to show that for every $m$ as in the statement, there is some subset of words with
primitivity rank $m$ and growth rate $2m - 1$.

So assume that $2m - 1 > \sqrt{2k-1}$. Take any subset $S \subseteq X$ of size $m$ and consider the subgroup
$H = F(S)$. Its core graph is a bouquet of $m$ loops. The number of words of length $t$ in $H$ is
$2m \cdot (2m-1)^{t-1}$. By Theorem 8.1, a random word in $H$ of length $t$ is a.a.s. non-primitive in $H$,
so its primitivity rank is at most $m$. On the other hand, the exponential growth rate of all words
with $\pi(w) < m$ combined is smaller than $2m - 1$ (by Proposition 4.3). Thus, a word $w \in H$ of
length $t$ satisfies $\pi(w) = m$ a.a.s., and we are done. □
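The count of reduced words used in the proof can be confirmed by brute force for small parameters. This is a minimal sketch; the function names and the encoding of letters as signed integers are illustrative assumptions.

```python
from itertools import product

def count_reduced_words(m, t):
    """Brute-force count of reduced words of length t in the free group on m letters.

    Letters are encoded as nonzero integers: i stands for x_i and -i for its inverse.
    A word is reduced when no letter is immediately followed by its inverse.
    """
    letters = [sign * i for i in range(1, m + 1) for sign in (1, -1)]
    count = 0
    for word in product(letters, repeat=t):
        if all(a != -b for a, b in zip(word, word[1:])):
            count += 1
    return count

def closed_form(m, t):
    """The formula 2m * (2m - 1)^(t - 1) for t >= 1."""
    return 2 * m * (2 * m - 1) ** (t - 1)
```

For example, `count_reduced_words(2, 3)` and `closed_form(2, 3)` both evaluate to 36.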



Remark 8.4. As aforementioned, we believe that the bounds from Proposition 4.3 are accurate for
all values of $m$. More precisely, it seems that for $m = 1$

$$\limsup_{t \to \infty} c_{k,m}(t)^{1/t} = \lim_{\substack{t \to \infty \\ t \text{ even}}} c_{k,m}(t)^{1/t} = \sqrt{2k-1},$$

and for $m \ge 2$,

$$\limsup_{t \to \infty} c_{k,m}(t)^{1/t} = \lim_{t \to \infty} c_{k,m}(t)^{1/t} = \begin{cases} \sqrt{2k-1} & 2m-1 \le \sqrt{2k-1} \\ 2m-1 & 2m-1 > \sqrt{2k-1} \end{cases} \tag{8.3}$$

(We single out the case $m = 1$ because the lim sup equals the limit only on even values of $t$.) It
follows from the proof of Proposition 4.3 that while for $2m - 1 > \sqrt{2k-1}$ the main source for
words with $\pi(w) = m$ is subgroups with core graphs of minimal size (and their conjugates), the
main source for $2m - 1 < \sqrt{2k-1}$ is subgroups with core graphs of maximal size, namely of size
roughly $t/2$.

For each $m$ found below the borderline, consider the subgroups of the form $H = \langle x_1, \ldots, x_{m-1}, u \rangle$
where $u$ is a cyclically reduced word of length $\approx t/2$ such that its first and last letters are not among
$\{x_1, \ldots, x_{m-1}\}$. Then, $\Gamma_X(H)$ has the form of a bouquet of $m - 1$ small loops of size 1 and one
large loop of size $\approx t/2$. Now consider the words $w = x_1^2 x_2^2 \cdots x_{m-1}^2 u^2$ and $w' = x_1^3 x_2^2 \cdots x_{m-1}^2 u^2$
(for $m = 1$ we have just $w = u^2$). Obviously, the growth rate of the number of possible $u$'s is
$\sqrt{2k-1}$, thus also the growth rate of the number of different $w$'s and $w'$'s. It can be shown that
both $w$ and $w'$ are not primitive in $H$, using the primitivity criterion from Theorem 5.6 ([Pud11,



†That primitive words in $F_k$ are negligible in this sense follows from the earlier result [BMS02, Thm f0.4], where
the exponential growth rate from Theorem 8.1 is shown to be $\le 2k - 2$.






Thm 1.1]). It seems that for generic $u$, $H$ is also of minimal rank with this property, i.e. a $w$-critical
subgroup, in which case $\pi(w) = \pi(w') = m$. Note that for $m \ge 2$ we presented words of lengths
both even and odd.

Finally, let us mention that for $m = 1, 2$, the statement is indeed true: for every even $t$ and
every cyclically reduced word $u$ of length $t/2$, the word $u^2$ is indeed of primitivity rank 1. For $m = 2$,
it can be shown that the above-mentioned words $w$ and $w'$ are non-powers for every choice of $u$
satisfying the conditions mentioned above. Thus, indeed $\pi(w) = \pi(w') = 2$.

Recall that in the proof of Theorem 1.1 we used bounds on the number of not-necessarily-reduced
words (and their critical subgroups). In this case, it can be shown that the bounds from Corollary
4.5 are accurate for every value of $m$:

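Theorem 8.5. Let $k \ge 2$ and $m \in \{0, 1, 2, \ldots, k, \infty\}$. Let

$$b_{k,m}(t) = \left|\{ w \in (X \cup X^{-1})^t \;:\; \pi(w) = m \}\right|.$$

Then for $m = 0$ we have

$$\lim_{\substack{t \to \infty \\ t \text{ even}}} b_{k,0}(t)^{1/t} = 2\sqrt{2k-1}.$$

For $m \in \{1, \ldots, k\}$,

$$\lim_{t \to \infty} b_{k,m}(t)^{1/t} = \begin{cases} 2\sqrt{2k-1} & 2m-1 \le \sqrt{2k-1} \\ 2m-1 + \frac{2k-1}{2m-1} & 2m-1 > \sqrt{2k-1} \end{cases}$$

Finally, for $m = \infty$ we have

$$\lim_{t \to \infty} b_{k,\infty}(t)^{1/t} = 2k - 2 + \frac{2}{2k-3}.$$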

This shows, in particular, that as in the case of reduced words, a generic word in $(X \cup X^{-1})^t$
is of primitivity rank $k$, namely, the share of words with this property tends to 1 as $t \to \infty$. Notice
also that this shows that for any $m$, the growth rate of the number of words with primitivity rank
$m$ is equal to the growth rate of the larger quantity $\sum_{w \in (X \cup X^{-1})^t:\, \pi(w)=m} |\mathrm{Crit}(w)|$.

Proof. For $m = 0$ this is (the proof of) Claim 4.7 (evidently, there are no odd-length words reducing
to 1). For $1 \le m$ with $2m - 1 \le \sqrt{2k-1}$ the same proof (as in Claim 4.7) can be followed as long
as we present at least one even-length and one odd-length word with primitivity rank $m$. And
indeed, as mentioned above (and see [Pud11, Lemma 6.8]), $\pi(x_1^2 x_2^2 \cdots x_m^2) = \pi(x_1^3 x_2^2 \cdots x_m^2) = m$.

If $2m - 1 > \sqrt{2k-1}$, the statement follows from the statements on reduced words (Theorems 8.2
and 8.1) and an application of the extended cogrowth formula (Theorem 4.4). □
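For concreteness, the growth rates in Theorem 8.5 can be tabulated with a few lines of code. The sketch below simply evaluates the formulas as they are read from the statement above; the function name is hypothetical, and the case $m = \infty$ is only evaluated for $k \ge 3$ (an assumption of this sketch, since the reduced growth rate $2k - 3$ of Theorem 8.1 is stated for $k \ge 3$). Above the borderline, the non-reduced rate is obtained from a reduced rate $\alpha$ as $\alpha + (2k-1)/\alpha$.

```python
from math import sqrt, inf

def nonreduced_growth_rate(k, m):
    """Exponential growth rate of b_{k,m}(t), following the formulas of Theorem 8.5.

    m = 0 covers words reducing to the identity and m = inf covers primitive words.
    """
    borderline = sqrt(2 * k - 1)
    if m == 0:
        return 2 * borderline
    if m == inf:
        if k < 3:
            raise ValueError("this sketch only evaluates m = inf for k >= 3")
        alpha = 2 * k - 3              # reduced growth rate from Theorem 8.1
        return alpha + (2 * k - 1) / alpha
    if not 1 <= m <= k:
        raise ValueError("m must be 0, inf, or between 1 and k")
    alpha = 2 * m - 1                  # reduced growth rate above the borderline
    if alpha <= borderline:
        return 2 * borderline
    return alpha + (2 * k - 1) / alpha
```

With these formulas, `nonreduced_growth_rate(k, k)` equals $2k$, the growth rate of all words in $(X \cup X^{-1})^t$, which matches the statement that a generic word has primitivity rank $k$.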

The result of the last theorem are summarized in Table [TJ 



9 Open Questions 

We end with some open problems that suggest themselves from this paper: 

• Can one show that the error term in Theorem 2.3 is indeed negative and relatively large (in
absolute value with respect to ) when \w\ logn? This would require improving the 
analysis in the proof of Proposition 5.1 (see Remark 6.2).

• Is it possible to generalize the techniques in this paper (and even more so the ones from 
|PP12| ) to odd values of d? (See Remark jOJ- 
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• Can one obtain the accurate exponential growth rate of the number of not-necessarily-reduced 
words with a given primitivity rank in a general base graph $\Omega$, thus improving the statement
of Theorem 4.8? This may require a further extension of the cogrowth formula that applies
to non-regular graphs (there have been a few attempts in this direction, see e.g. [Bar99, Nor04,
AFH07]).

• Several classic results from the theory of expansion in graphs were generalized lately to simplicial
complexes of dimension greater than one (see e.g. [GW12, PRT12, Lub13]). In particular,
a parallel of the Alon-Boppana Theorem is presented in [PR12]. Is there a parallel to Alon's conjecture
in this case? Can the methods of the current paper be extended to higher dimensions?
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Appendices 

A Contiguity and Related Models of Random Graphs 
Random d-regular graphs 

In this paper, the statement of Theorem 1.1 is first proved for the permutation model of random
d-regular graphs. We then derive Theorem 1.1, stated for the uniform distribution on all
d-regular simple graphs on n vertices with d even, using results of Wormald [Wor99] and Greenhill
et al. [GJKW02] . These works show the contiguity (see footnote on page ^ of different models for 
random regular graphs. 

In particular, they describe the following model: consider $dn$ labeled points, with $d$ points in
each of $n$ buckets, and take a random perfect matching of the points. Letting the buckets be
vertices and each pair represent an edge, one obtains a random d-regular graph. This model is
denoted $\mathcal{G}^*_{n,d}$. It is shown [GJKW02, Thm. 1.3] that $\mathcal{G}^*_{n,d}$ is contiguous to the permutation model
$\mathcal{P}_{n,d}$. If $\Gamma$ is a random d-regular graph in $\mathcal{G}^*_{n,d}$, the event that $\Gamma$ is a simple graph (with no loops
or multiple edges) has positive probability, bounded away from 0. Moreover, within this event,
simple graphs are distributed uniformly*. Thus, Theorem 1.1 follows from the corresponding result
for the permutation model.
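The pairing model described above is easy to simulate. The following is a minimal sketch under stated assumptions: the function names are hypothetical, buckets are numbered `0..n-1`, and the simple-graph event is realized by plain rejection sampling with an arbitrary retry limit.

```python
import random

def pairing_model(n, d, rng):
    """Sample a d-regular multigraph on n vertices from the pairing model.

    Each vertex (bucket) owns d points; shuffling all d*n points and pairing
    consecutive ones yields a uniformly random perfect matching of the points.
    """
    if (n * d) % 2 != 0:
        raise ValueError("n * d must be even")
    points = [v for v in range(n) for _ in range(d)]   # bucket of each point
    rng.shuffle(points)
    return [(points[i], points[i + 1]) for i in range(0, len(points), 2)]

def is_simple(edges):
    """True when the multigraph has neither loops nor multiple edges."""
    seen = set()
    for u, v in edges:
        key = (min(u, v), max(u, v))
        if u == v or key in seen:
            return False
        seen.add(key)
    return True

def sample_simple(n, d, rng, max_tries=10000):
    """Condition on the simple-graph event by rejection sampling."""
    for _ in range(max_tries):
        edges = pairing_model(n, d, rng)
        if is_simple(edges):
            return edges
    raise RuntimeError("no simple graph found within max_tries")
```

A call such as `sample_simple(10, 3, random.Random(1))` returns the edge list of a simple 3-regular graph on 10 vertices; since the simple-graph event has probability bounded away from 0, the rejection loop terminates quickly in practice.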

Random d-regular bipartite graphs 

As an immediate corollary of Theorem 1.5 we deduced that a random d-regular bipartite graph
is "nearly Ramanujan" in the sense that besides its two trivial eigenvalues $\pm d$, all other eigenvalues
are at most $2\sqrt{d-1} + 0.84$ in absolute value a.a.s. (Corollary 1.6). Our proof works in the model
$\mathcal{C}_{n,\Omega}$ (here $\Omega$ is a graph with 2 vertices and $d$ parallel edges connecting them). However, by the
results of [Ben74], the probability that our graph has no multiple edges is bounded away from zero
(asymptotically it is $1/e^{\binom{d}{2}}$). Thus, our result applies also to the model of $d$ random disjoint perfect
matchings between two sets of $n$ vertices. This model, in turn, is contiguous to the uniform model

*To be precise, vertex-labeled simple graphs are distributed uniformly in this event. Unlabeled simple graphs have
probability proportional to the order of their automorphism group. Then again, for $d \ge 3$, this group is a.a.s. trivial,
so the result of Theorem 1.1 applies both to the uniform model of labeled graphs and to the uniform model of unlabeled
graphs.

of bipartite (vertex-labeled) d-regular simple graphs (for $d \ge 3$: see [MRRW97, Section 4]†), so our
result applies in the latter model as well.

Random coverings of a fixed graph 

In Theorem 1.4 we consider random n-coverings of a fixed graph $\Omega$ in the model $\mathcal{C}_{n,\Omega}$, where a
uniform random permutation is generated for every edge of $\Omega$. An equivalent model is attained
if we cover some spanning tree of $\Omega$ by $n$ disjoint copies and then choose a random permutation
for every edge outside the tree (that is, the same automorphism-types of non-labeled graphs are
obtained with the same distribution). In fact, picking a basepoint $\otimes \in V(\Omega)$, there is yet another
description for this model: the classification of n-sheeted coverings of $\Omega$ by the action of $\pi_1(\Omega, \otimes)$
on the fiber $\{\otimes\} \times [n]$ above $\otimes$ shows that $\mathcal{C}_{n,\Omega}$ is equivalent to choosing uniformly at random an
action of the free group $\pi_1(\Omega, \otimes)$ on $\{\otimes\} \times [n]$.
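The first description of the model, one uniform random permutation per edge of the base graph, can be sketched directly. The function names and the representation of the base graph as an edge list are illustrative assumptions; a loop `(v, v)` in the base graph is allowed and contributes 2 to the degree of `v`.

```python
import random
from collections import Counter

def random_covering(base_edges, n, rng):
    """Sample an n-covering of a base multigraph given as a list of edges (u, v).

    The covering has vertices (v, i) for 0 <= i < n. Each base edge (u, v) gets its
    own uniform random permutation sigma and lifts to the n edges (u, i) - (v, sigma(i)).
    """
    cover = []
    for u, v in base_edges:
        sigma = list(range(n))
        rng.shuffle(sigma)
        cover.extend(((u, i), (v, sigma[i])) for i in range(n))
    return cover

def degrees(edges):
    """Degree of every vertex; a loop counts twice."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg
```

Taking the base graph to be a bouquet of $d/2$ loops, for instance `[(0, 0)] * (d // 2)`, recovers the permutation model of random d-regular graphs. In every sample, the degree of a lifted vertex `(v, i)` equals the degree of `v` in the base graph.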

A different but related model uses the classification of connected, pointed coverings of $(\Omega, \otimes)$
by the corresponding subgroups of $\pi_1(\Omega, \otimes)$. A random n-covering is thus generated by choosing
a random subgroup of index $n$. However, it seems that this model is contiguous to $\mathcal{C}_{n,\Omega}$ restricted
to connected graphs (note that the random covering $\Gamma$ in $\mathcal{C}_{n,\Omega}$ is a.a.s. connected provided that
$\mathrm{rk}(\Omega) \ge 2$). Indeed, the only difference is that in the new model, the probability of every connected
graph $\Gamma$ from $\mathcal{C}_{n,\Omega}$ is proportional to $|\mathrm{Aut}(\Gamma)|$. When $\mathrm{rk}(\Omega) \ge 2$, it seems that a.a.s. $|\mathrm{Aut}(\Gamma)| = 1$,
which would show that our results apply to this model as well.

Finally, there is another natural model that comes to mind: given a periodic infinite tree,
namely a tree that covers some finite graph, one can consider a random (simple) graph $\Gamma$ with $n$
vertices covered by this tree (with uniform distribution among all such graphs with $n$ vertices, for
suitable $n$'s only). One can then analyze $\lambda(\Gamma)$, the largest absolute value of an eigenvalue besides
$\mathrm{pf}(\Gamma)$. (This generalizes the uniform model on d-regular graphs.) Occasionally, all the quotients
of some given periodic tree $T$ cover the same finite "minimal" graph $\Omega$. Interestingly, Lubotzky
and Nagnibeda [LN98] showed that there exist such $T$'s with a minimal quotient $\Omega$ which is not
Ramanujan (in the sense that $\lambda(\Omega)$ is strictly larger than $\rho(T)$, the spectral radius of $T$). Since all
the quotients of $T$ inherit the eigenvalues of $\Omega$, their $\lambda(\cdot)$ is also bounded away from $\rho(T)$ (from
above). Hence, the corresponding version of Conjecture 1.3 is false in this general setting.

B Spectral Expansion of Non-Regular Graphs 

In this section we provide some background on the theory of expansion of irregular graphs, describing
how spectral expansion is related to other measurements of expansion (combinatorial expansion,
random walks and mixing). This further motivates the claim that Theorem 1.4 shows that if the
base graph $\Omega$ is a good (nearly optimal) expander, then a.a.s. so are its random coverings. We
would like to thank Ori Parzanchevski for his valuable assistance in writing this appendix.

The spectral expansion of a (non-regular) graph $\Gamma$ on $m$ vertices is measured by some function on
its spectrum, and most commonly by the spectral gap: the difference between the largest eigenvalue
and the second largest. As mentioned above, it is not a priori clear which operator best describes
in spectral terms the properties of the graph. There are three main candidates (see, e.g. [GW12]),
all of which are bounded*, self-adjoint operators and so have real spectrum:

†In fact, there is an explicit proof there only for $d = 3$. To derive the general case, one can show that a
random $(d + 1)$-regular graph is contiguous to a random d-regular bipartite graph plus one edge-disjoint random
matching (following, e.g., the computations in [BM86]). We would like to thank Nick Wormald for helpful private
communications surrounding this point.

*All operators considered here are bounded provided that the degree of vertices in $\Gamma$ is bounded. This is the case
in all the graphs considered in this paper.

(1) The adjacency operator $A_\Gamma$ on $(\ell^2(V(\Gamma)), 1)$†:

$$(A_\Gamma f)(v) = \sum_{w \sim v} f(w)$$

If $\Gamma$ is finite this operator is represented in the standard basis by the adjacency matrix, and
its spectral radius is the Perron-Frobenius eigenvalue $\mathrm{pf}(\Gamma)$. The spectrum in this case is

$$\mathrm{pf}(\Gamma) = \lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_m \ge -\mathrm{pf}(\Gamma),$$

and the spectral gap is $\mathrm{pf}(\Gamma) - \lambda(\Gamma)$, where $\lambda(\Gamma) = \max\{\lambda_2, -\lambda_m\}$‡. The spectrum of $A_\Gamma$
was studied in various works, for instance [Gre95, LN98, Fri03, LP10].

(2) The averaging Markov operator $M_\Gamma$ on $(\ell^2(V(\Gamma)), \deg(\cdot))$§:

This operator is given by $D_\Gamma^{-1} A_\Gamma$, and its spectrum is contained in $[-1, 1]$. The eigenvalue 1
corresponds to locally-constant functions when $\Gamma$ is finite, and in this case the spectrum is

$$1 = \mu_1 \ge \mu_2 \ge \ldots \ge \mu_m \ge -1.$$

The spectral gap is then $1 - \mu(\Gamma)$, where $\mu(\Gamma) = \max\{\mu_2, -\mu_m\}$. Up to a possible affine transformation,
the spectrum of $M_\Gamma$ is the same as the spectrum of the simple random walk operator
($A_\Gamma D_\Gamma^{-1}$) or of one of the normalized Laplacian operators ($I - A_\Gamma D_\Gamma^{-1}$ or $I - D_\Gamma^{-1/2} A_\Gamma D_\Gamma^{-1/2}$).
This spectrum is considered for example in [Sin93, Chu97, GZ99].

(3) The Laplacian operator $\Delta^+_\Gamma$ on $(\ell^2(V(\Gamma)), 1)$:

$$(\Delta^+_\Gamma f)(v) = \deg(v) f(v) - \sum_{w \sim v} f(w)$$

The Laplacian equals $D_\Gamma - A_\Gamma$, where $D_\Gamma$ is the diagonal operator $(D_\Gamma f)(v) = \deg(v) \cdot f(v)$.
The entire spectrum is non-negative, with 0 corresponding to locally-constant functions when
$\Gamma$ is finite. In the finite case, the spectrum is

$$0 = \nu_1 \le \nu_2 \le \ldots \le \nu_m,$$

the spectral gap being $\nu_2 - \nu_1 = \nu_2$. The Laplacian operator is studied e.g. in [AM85].

For a regular graph $\Gamma$, all different operators are identical up to an affine shift. However, in the
general case there is no direct connection between the three different spectra. In this paper we
consider the spectra of $A_\Gamma$ and of $M_\Gamma$. At this point we do not know how to extend our results to
the Laplacian operator $\Delta^+_\Gamma$.

The spectrum of all three operators is closely related to different notions of expansion in graphs.
The adjacency operator, for example, has the following version of the expander mixing lemma: for
every two subsets $S, T \subseteq V(\Gamma)$ (not necessarily disjoint), one has

$$\left| E(S,T) - \mathrm{pf}(\Gamma)\, \mathrm{vol}_{\mathrm{pf}}(S)\, \mathrm{vol}_{\mathrm{pf}}(T) \right| \le \lambda(\Gamma) \sqrt{|S|\,|T|}$$



†Here, $(\ell^2(V(\Gamma)), 1)$ stands for $\ell^2$-functions on the set of vertices $V(\Gamma)$ with the standard inner product:
$\langle f, g \rangle = \sum_v f(v) g(v)$. In the summation $\sum_{w \sim v}$, each vertex $w$ is repeated with multiplicity equal to the number of
edges between $v$ and $w$.

‡Occasionally, the spectral gap is taken to be $\mathrm{pf}(\Gamma) - \lambda_2(\Gamma)$.

§Here, $(\ell^2(V(\Gamma)), \deg(\cdot))$ stands for $\ell^2$-functions on the set of vertices $V(\Gamma)$ with the inner product: $\langle f, g \rangle =
\sum_v f(v) g(v) \deg(v)$.

where $\mathrm{vol}_{\mathrm{pf}}(S) = \langle 1_S, f_{\mathrm{pf}}(\Gamma) \rangle$ and $f_{\mathrm{pf}}(\Gamma)$ is the (normalized) Perron-Frobenius eigenfunction. This
is particularly useful in the $\mathcal{C}_{n,\Omega}$ model since $f_{\mathrm{pf}}(\Gamma)$ is easily obtained from the Perron-Frobenius
eigenfunction of $\Omega$ by

$$f_{\mathrm{pf}}(\Gamma) = \frac{1}{\sqrt{n}}\, f_{\mathrm{pf}}(\Omega) \circ \pi.$$

In the d-regular case, this amounts to the usual mixing lemma: $\left| E(S,T) - \frac{d\,|S|\,|T|}{|V(\Gamma)|} \right| \le \lambda(\Gamma) \sqrt{|S| \cdot |T|}$.

If one takes $T = V \setminus S$, one can attain a bound on the Cheeger constant of $\Gamma$ (see (B.1)).
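The d-regular form of the mixing lemma can be tested empirically on a graph whose spectrum is known. The sketch below uses the Petersen graph, which is 3-regular on 10 vertices with adjacency eigenvalues 3, 1 and -2, so that $\lambda = 2$. The choice of graph, the function names and the random sampling of subsets are assumptions of this illustration; $E(S,T)$ is taken to count ordered pairs $(s, t)$ of adjacent vertices with $s \in S$ and $t \in T$.

```python
import random
from math import sqrt

def petersen_neighbors():
    """Neighbor sets of the Petersen graph on vertices 0..9."""
    nbrs = {v: set() for v in range(10)}
    def add(u, v):
        nbrs[u].add(v)
        nbrs[v].add(u)
    for i in range(5):
        add(i, (i + 1) % 5)            # outer 5-cycle
        add(i, i + 5)                  # spokes
        add(5 + i, 5 + (i + 2) % 5)    # inner pentagram
    return nbrs

def edge_count(nbrs, s, t):
    """E(S, T): number of ordered pairs (u, v) with u in S, v in T and u ~ v."""
    return sum(1 for u in s for v in nbrs[u] if v in t)

def worst_ratio(nbrs, d, lam, trials, seed):
    """Largest observed |E(S,T) - d|S||T|/n| / (lam * sqrt(|S||T|)) over random S, T."""
    rng = random.Random(seed)
    n = len(nbrs)
    worst = 0.0
    for _ in range(trials):
        s = {v for v in nbrs if rng.random() < 0.5}
        t = {v for v in nbrs if rng.random() < 0.5}
        if not s or not t:
            continue
        deviation = abs(edge_count(nbrs, s, t) - d * len(s) * len(t) / n)
        worst = max(worst, deviation / (lam * sqrt(len(s) * len(t))))
    return worst
```

The mixing lemma predicts that `worst_ratio(petersen_neighbors(), 3, 2, trials, seed)` never exceeds 1, whatever the number of trials or the seed.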

As for the averaging Markov operator, it is standard that $\mu(\Gamma)$ controls the speed at which a
random walk converges to the stationary distribution. In addition, if one defines $\deg(S)$ to denote
the sum of degrees of the vertices in $S$, then

$$\left| E(S,T) - \frac{\deg(S)\deg(T)}{2|E(\Gamma)|} \right| \le \mu(\Gamma) \sqrt{\deg(S)\deg(T)}.$$

Moreover, consider the conductance of $\Gamma$

$$\phi(\Gamma) = \min_{\emptyset \ne S \subset V,\ \deg(S) \le |E(\Gamma)|} \frac{|E(S, V \setminus S)|}{\deg(S)}.$$

Then the following version of the Cheeger inequality holds [Sin93, Lemmas 2.4, 2.6]:

$$\frac{\phi^2(\Gamma)}{2} \le 1 - \mu_2 \le 2\phi(\Gamma).$$



Finally, the spectrum of the Laplacian operator is related to the standard Cheeger constant of
$\Gamma$, defined as

$$h(\Gamma) = \min_{\emptyset \ne S \subset V,\ |S| \le |V|/2} \frac{|E(S, V \setminus S)|}{|S|}. \tag{B.1}$$

By the so-called "discrete Cheeger inequality" [AM85],

$$\frac{h^2(\Gamma)}{2k} \le \nu_2 \le 2h(\Gamma)$$

with $k$ being the largest degree of a vertex. In addition, one has a variation on the mixing lemma
for $\Delta^+$ as well [PRT12, Thm 1.4].
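As a small sanity check of the discrete Cheeger inequality, one can compute the Cheeger constant of a cycle by brute force and compare it with the second Laplacian eigenvalue, which for the cycle on $n$ vertices is $2 - 2\cos(2\pi/n)$. The choice of the cycle, the function names and the form of $h$ (a minimum over nonempty sets of at most half the vertices) are assumptions of this sketch.

```python
from itertools import combinations
from math import cos, pi

def cycle_edges(n):
    """Edges of the cycle on vertices 0..n-1."""
    return [(i, (i + 1) % n) for i in range(n)]

def cheeger_constant(n, edges):
    """min |E(S, V \\ S)| / |S| over nonempty S with |S| <= n/2 (brute force)."""
    best = float("inf")
    for size in range(1, n // 2 + 1):
        for subset in combinations(range(n), size):
            s = set(subset)
            boundary = sum((u in s) != (v in s) for u, v in edges)
            best = min(best, boundary / size)
    return best

def cycle_nu2(n):
    """Second smallest Laplacian eigenvalue of the n-cycle."""
    return 2 - 2 * cos(2 * pi / n)
```

For the 8-cycle, the Cheeger constant is attained by an arc of 4 consecutive vertices, and with maximal degree $k = 2$ the value `cycle_nu2(8)` lies between $h^2/(2k)$ and $2h$.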
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